DIMACS

General Information

Student: Benjamín Benčík
Office: CoRE 444
School: Faculty of Mathematics and Physics, Charles University
Contact: bencikben@gmail.com
Project: Memory-query tradeoffs for property testing
Mentor: Sumegha Garg

I am part of a group of students from Charles University that includes Todor Antic, Adam Dzavoronok, Jelena Glisic, Robert Jaworski, Tymur Kotkov, Sofia Kotsiubynska, Julia Krizanova, Volodymyr Kuznietsov, Tymofii Reizin and Jakub Sosovicka, Filip Uradnik. Patrik Zavoral.


About me: I study Computer Science and I am interested in cryptography and theoretical computer science.

Project Description

Property testing is a natural notion of approximation for decision problems, where given a property (or decision), the task is to distinguish whether a given instance has this property or is "far" from any instance having the property. Such tasks are widely used in various areas of TCS, such as coding theory, hardness of approximation and graph algorithms. In this project, we will focus on property testing on graphs, and explore the impact of memory constraints on the complexity of property testing. One of main measures of complexity for a property testing algorithm is its query complexity, which is the maximum number of input elements queried to determine whether the graph satisfies a given property.


Research Log

Week 1

Before the start of the project I have read Property testing by Dana Ron, which offers a comprehensive summary of the field. In the first week project I have looked at the results discovered in the field and tried to estimate memory complexity for the existing property testing algorithms.

bipartitness[1]: \(\widetilde{\mathcal{O}}(\epsilon^{-2})\) k-colorability[1]: \(poly(\widetilde{\mathcal{O}}(k^4 / \epsilon^{6}))\) p-clique[1]: \( \widetilde{\mathcal{O}}(p / \epsilon^{2}) \) connectivity[2]: \( \mathcal{O}\left(\frac{d}{M/N \cdot \epsilon}\right) \) eulerian[2]: \( \widetilde{\mathcal{O}}\left(\frac{d}{M/N \cdot \epsilon/2}\right) \)
The notation with ~ hide logarithmic factors. The memory complexity is given in terms of distance parameter \( \epsilon \), which is the maximum distance from the property.

We have not yet decided on the exact objective of the project, but next week I will decide which of the properties might make the most sense to study in terms of the memory complexity. The project will have mainly exploratory basis and we I focus on constructing lower bounds rather than designing a specific algorithm.

Week 2

Most of the properties I have looked at had trivial lower bounds for memory complexity. We have decided to study algorithms that decide if a graph has: triangle, \(k\)-clique or \(k\)-star. So far the bounds that I have achieved to show are only trivial.

Moreover I have read additional resources on property testing and techniques used in the field. I have looked at first three lectures of Information theory in computer science and the streaming model for graphs described in Data Streaming Algorithms.

Week 3

I have looked for known tester of the selected properties. My main focus was on the triangle-freeness property tester utilizing the Szemerédi’s Regularity Lemma. I have tried to come up with indistinguishable distributions for the lower bound on testing star-freeness but did not show anything better than the trivial bounds. With my mentor we have made the problem more specific and I started working testers for the star-freeness property on more specific graph families.

Week 4

This week my main focus was on the star-freeness property. I have tried to construct 2 testers where one optimized the query complexity and the other memory complexity. The aim is to be able to show some trade-off and prove lower bound for query complexity \( \times \) memory complexity. To be more specific I have tried to distinguish the following families:

I was able to find algorith with complexities \( \mathcal{O}(n^2), \mathcal{O}(\log{n}) \) and a second one with complexities \( \mathcal{O}(\log{n}), \mathcal{O}(n\log{n}) \) and prove their validity. However, my mentor eventually came up with an algorithm that achieved the optimum in both cases \( \mathcal{O}(n), \mathcal{O}(\log{n}) \) meaning that there was no tradeoff. I provided thorough analysis and proves in a technial writeup.

Week 5

I have studied a problem that tries to distinguish random graphs with degree bounds \(1/4n\) and \(3/4n\). In the design of various testers the gap between "memory optimal" and "query optimal" aproaches was small. That being said I did not find the memory size to make a huge trade-off for the property testing problems that I have studied. Usually the memory effiecient algorithm was counting and could get aways with constant memory. For this the section on Aproximating average degree from [3] was very helpful. On other hand I was struggling to get a significantly more efficient for large memory. I have looked at other approaches, I found out that graph adjacency matrix can be subjected to spectral analysis which offers some insight about the graph however the query complexity for approximating eigenvectors is expensive as well.

I have tried to consider different models. My mentor suggested looking at a semi-randomness model where tester can query a vertex \(v\) and recieves a random edge connect to \(v\). In this setting I have tried to distinguish two families:

Week 6

Week 7


Resources

  1. Property Testing and Its Connection to Learning and Approximation 98 - Oded Goldreich, Shafi Goldwasser, Dana Ron
  2. Property Testing in Bounded Degree Graphs 97 - Oded Goldreich, Dana Ron
  3. Introduction to Property Testing 2017 - Oded Goldreich

Acknowledgements

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 823748.