DIMACS
DIMACS REU 2023

General Information

me
Student: Sahil Kuchlous
School: Harvard University
Email: sahilkuchlous at college dot harvard dot edu
Project: Algorithms for Streaming Tournaments
Mentor: Dr. Prantar Ghosh

Project Description

A tournament is a digraph (directed graph) where every pair of vertices shares exactly one directed edge. We are interested in studying tournaments in the streaming model where the input graph is built up by a sequence of edges, and we need to maintain a summary of it using low memory so as to answer certains questions about the graph. Sometimes we are allowed multiple passes over the edge stream to reach a solution. A recent paper [Chakrabarti, Ghosh, McGregor, Vorotnikova, SODA2020] shows that in streaming, many classical problems turn out to be hard for general digraphs, i.e., they are provably unsolvable without storing almost the entire graph; on the other hand, they show that these problems are tractable when the digraph is a tournament. They design a collection of efficient algorithms for these problems, but hardly rule out better ones. We want to investigate whether we can design more efficient streaming algorithms for classical problems on tournaments, or whether we can prove optimality of existing algorithms. One concrete problem is finding the minimum Feedback Arc Set (FAS). An FAS in a digraph is a set of arcs (directed edges) whose deletion effectively removes all directed cycles. Since this set can have very large size, a deterrent for the streaming model, we want to solve an equivalent version of the problem where we need to output only an ordering of the nodes such that the number of back edges (edges going backwards in the ordering) is minimized (the back edges form an FAS of the graph). We can also ask the question of simply finding the size of the minimum FAS (FAS-size). While prior work has designed algorithms that give approximate solutions to FAS and FAS-size for tournaments, neither did they rule out exact algorithms nor did they prove that the same approximations cannot be achieved in fewer passes or using smaller space. In fact, even the existing lower bounds for FAS in general digraphs only rule out multiplicative approximations; whether we can achieve estimates within a reasonably small additive factor is unknown and can be an interesting topic of study.


Weekly Log

Week 1:
We spent most of the week reading through existing literature in the area. [CGMV20] and [BJW21] introduce multi-pass approximation algorithms for FAS on tournaments, and [GO16] introduces a communication task that helps prove strong multi-pass lower bounds for problems like acyclicity in general digraphs. We also worked on extending a few of the results from [CGMV20], including a multi-pass algorithm for acyclicity on tournaments and an easier lower bound for acyclicity on tournaments. Finally, we identified a few more problems on tournaments that may be interesting to investigate, including s-t connectivity. At the end of the week we worked on the introductory presentation linked below.
Week 2:
We continued reading [CGMV20], focusing on random-order sink finding (section 4.2) and the 3-approximation for FAS-T in polynomial time (section 3.2). We worked on some new results for sink finding lower bounds, and started writing our new results as a formal report. We continued thinking about the FAS problem, and came up with a 2-pass algorithm that distinguishes between FAS 0, 1 and greater than 1. We suspect this can be reduced to a single pass. We discussed applications of the hidden pointer chasing communication problem to finding lower bounds, and the issues with applying it to some problems we care about. Finally, we continued to think about ways to establish lower bounds on tournament streaming.
Week 3:
A significant portion of this week was sepnt attending the DIMACS Workshop on Modern Techniques in Graph Algorithms, which had a lot of interesting tutorials and presentations. The ones most relevant to my project included Prof. Assadi's tutorial on applications of Ruzsa-Szemeredi graphs to streaming lower bounds, Prof. Khanna's presentation on palette sparsification for graph coloring and Janani Sundaresan's presentation on random-order streaming lower bounds. I also worked with George, another REU student, on compiling the open problems presented at the workshop. For our research project, we continued to think about FAS lower bounds, and came up with a few concrete approaches that may help construct a reduction. These included considering the deterministic communication setting, insert/delete streams, and the augmented INDEX communication problem.
Week 4:
We finally made some progress on the FAS lower bound for tournaments, and have an interesting construction. Unfortunately, we are having some trouble proving that it works, and spent most of the week thinking about and discussing ways to do so. It seems possible that it could be proved using brute casework, but we're hoping for something a bit more elegant. We also went through [BJW21] in more detail, focusing particularly on the improvements it makes over the polynomial-time 3-approximation given in [CGMV20]. I plan to present this paper some time next week.
Week 5:
We worked through more of the FAS lower bound, and came up with some new lower bounds for s-t reachability and s-t distance. Combined with the results from earlier, this gives us a pretty strong foundtation for the report, and I plan to continue writing up the formal proofs in the report next week. I didn't have time to present [BJW21] this week, but we plan to do that soon. We had a couple of nice talks this week, including one on Ramsey theory and another on proof by example. Both were very well delivered. I also got a chance to visit the Zimmerli Art Museum and meet some of the REU students from the Baruch program.
Week 6:
We made some progress on FAS ordering lower bounds (as opposed to FAS distance), and were able to get some upper bounds for s-t reachability, although it turns out a very similar result already existed in [BJW21]. We now have strong upper bounds for s-t reachability and strong lower bounds for s-t connectivity, so the goal is to tighten these from the opposite direction. We also spent some time thinking about lower bounds to rule out FAS approximations, but didn't make much progress. In the other direction, we considered whether it was possible to break the 5-approximation barrier for single-pass FAS in polynomial time, but this also seems pretty hard. Outside of research, I kept working on the report, and attended a nice panel on graduate school.
Week 7:
We continued thinking about the 5-approximation barrier for single-pass FAS, and were able to identify the bottleneck case after studying [CFR08]. However, we still haven't found a way to beat the 5-approximation for this case. We also spent some time working on s-t reachability and connectivity, but didn't make much progress. Outside of research, I presented some single-pass FAS algorithms to the theory reading group, and we went on a trip to IBM research.
Week 8:
We managed to get a stronger FAS approximation lower bound, but there is still a fairly large gap between this and the upper bound. It seems unclear how to get more out of our current approach, so we're considering stronger communication problems to reduce from. We also identified a few gaps in the deterministic approximation algorithms for FAS size, since the 5-approximation for FAS ordering doesn't seem to help. For s-t connectivity, algorithms for adversarial streams seem hard, so we're thinking about random-order streams first. Outside of research, I finished my final presentation.
Week 9:
We came up for an idea to improve the FAS approximation lower bound, but it is more technical and involves showing a bound on a new communication problem. We suspect it shouldn't be too hard, and I've been working on figuring this out. We've also been thinking about improving the approximation upper bound from CGMV to get closer to the lower bound we've established. We have a couple of ideas for how to achive this, but its not clear if any of them work yet. Since its the last week, I've also been working on my final report. Although the program ends this week, we're planning to continue working on the project, since we still have some ideas we haven't explored yet.

Presentations


References