Student: | Benjamín Benčík |
---|---|
Office: | CoRE 444 |
School: | Faculty of Mathematics and Physics, Charles University |
Contact: | bencikben@gmail.com |
Project: | Memory-query tradeoffs for property testing |
Mentor: | Sumegha Garg |
I am part of a group of students from Charles University that includes Todor Antic, Adam Dzavoronok, Jelena Glisic, Robert Jaworski, Tymur Kotkov, Sofia Kotsiubynska, Julia Krizanova, Volodymyr Kuznietsov, Tymofii Reizin and Jakub Sosovicka, Filip Uradnik. Patrik Zavoral.
Property testing is a natural notion of approximation for decision problems, where given a property (or decision), the task is to distinguish whether a given instance has this property or is "far" from any instance having the property. Such tasks are widely used in various areas of TCS, such as coding theory, hardness of approximation and graph algorithms. In this project, we will focus on property testing on graphs, and explore the impact of memory constraints on the complexity of property testing. One of main measures of complexity for a property testing algorithm is its query complexity, which is the maximum number of input elements queried to determine whether the graph satisfies a given property.
Before the start of the project I have read Property testing by Dana Ron, which offers a comprehensive summary of the field. In the first week project I have looked at the results discovered in the field and tried to estimate memory complexity for the existing property testing algorithms.
bipartitness[1]: \(\widetilde{\mathcal{O}}(\epsilon^{-2})\) | k-colorability[1]: \(poly(\widetilde{\mathcal{O}}(k^4 / \epsilon^{6}))\) | p-clique[1]: \( \widetilde{\mathcal{O}}(p / \epsilon^{2}) \) | connectivity[2]: \( \mathcal{O}\left(\frac{d}{M/N \cdot \epsilon}\right) \) | eulerian[2]: \( \widetilde{\mathcal{O}}\left(\frac{d}{M/N \cdot \epsilon/2}\right) \) |
We have not yet decided on the exact objective of the project, but next week I will decide which of the properties might make the most sense to study in terms of the memory complexity. The project will have mainly exploratory basis and we I focus on constructing lower bounds rather than designing a specific algorithm.
Most of the properties I have looked at had trivial lower bounds for memory complexity. We have decided to study algorithms that decide if a graph has: triangle, \(k\)-clique or \(k\)-star. So far the bounds that I have achieved to show are only trivial.
Moreover I have read additional resources on property testing and techniques used in the field. I have looked at first three lectures of Information theory in computer science and the streaming model for graphs described in Data Streaming Algorithms.
This week my main focus was on the star-freeness property. I have tried to construct 2 testers where one optimized the query complexity and the other memory complexity. The aim is to be able to show some trade-off and prove lower bound for query complexity \( \times \) memory complexity. To be more specific I have tried to distinguish the following families:
I was able to find algorith with complexities \( \mathcal{O}(n^2), \mathcal{O}(\log{n}) \) and a second one with complexities \( \mathcal{O}(\log{n}), \mathcal{O}(n\log{n}) \) and prove their validity. However, my mentor eventually came up with an algorithm that achieved the optimum in both cases \( \mathcal{O}(n), \mathcal{O}(\log{n}) \) meaning that there was no tradeoff. I provided thorough analysis and proves in a technial writeup.
I have studied a problem that tries to distinguish random graphs with degree bounds \(1/4n\) and \(3/4n\). In the design of various testers the gap between "memory optimal" and "query optimal" aproaches was small. That being said I did not find the memory size to make a huge trade-off for the property testing problems that I have studied. Usually the memory effiecient algorithm was counting and could get aways with constant memory. For this the section on Aproximating average degree from [3] was very helpful. On other hand I was struggling to get a significantly more efficient for large memory.\ I have looked at other approaches, I found out that graph adjacency matrix can be subjected to spectral analysis which offers some insight about the graph however the query complexity for approximating eigenvectors is expensive as well.
I have tried to consider different models. My mentor suggested looking at a semi-randomness model where tester can query a vertex \(v\) and recieves a random edge connect to \(v\). In this setting I have tried to distinguish two families:
I came up with algorithm for the graph families that I have constructed the previous week but it mainly relied on the fact that the cliques were isolated. This instance of the problem is relatively easy and it can be computed in \(\sqrt{n}\) instances. We have tried to design a harder problem a specified graph families where the problem is to distinguish clique of size \(n/4\). Observation to this problem is that sampling into a random sets is probabily not going to make a valid tester since, it is not propbable that we will obtain an edge between those sets.
We spent the week mostly by wrapping our notes up, and by preparing the final presentation.
During the rest of the sumemr I was studying for the final graduation test then I was moving to Technical University of Munich. Currectly we are working on \(n/3\)-clique tester for the semi-randomness model. Querying \(v\) gives another vertex \(u\) and a bit indicating if there exists an edge \((u, v)\). We managed to come up with algorithm for the property tester and this week I have been writing the analysis for this algorithm. We constructed stronger algorithm that can have access to the whole neighbourood and then showed that the quantity computed in our algorithm is a correct approximation of the idead quantity. We have also shown that the quantity is a good approximation of the number of triangles in the graph. The algorithm is based on the idea of counting triangles in the graph and then using the fact that the number of triangles is a good approximation of the number of \(n/3\)-cliques. Right now i am reading the paper on Streaming Lower Bounds for Approximating MAX-CUT, which might help us the establish the lower bounds for the property tester.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 823748.