Tight Memory-Sample Tradeoffs for Learning Parity with Noise
Project Description
The problem of learning under memory-constraints has important applications to machine learning and cryptography. In our project, we seek to prove lower bounds on memory-sample tradeoffs for the learning parity with noise (LPN) problem. The goal is to improve upon the bounds proven in [GKLR'21].
Acknowledgements
Thank you to my mentor Professor Sumegha Garg for guidance in this project. Our meetings were invaluable and illuminated how intuition is built and utilized both in this project and in computer science theory research more broadly. I look forward to continuing this project beyond the summer REU session.
This work was carried out while I was a participant in the 2024 DIMACS REU program at Rutgers University, supported by NSF grant CNS-2150186.
Final Presentation:
Weekly Log
Week 1:
I met with Professor Garg at the start of the program. We had previously discussed the problem statement over Zoom, and she answered a few questions I had about related works. Then she gave me more background on relevant techniques in past work, namely branching programs and extractors. We will discuss potential approaches to proving a tighter lower bound next week.
Following our meeting for the rest of the week, I spent most of my time reading the main proof of Memory-Sample Lower Bounds for Learning Parity with Noise [GKLR21], which is the paper we are jumping off from with the best memory-sample lower bound on LPN so far. I also refreshed lecture notes on information theory, which will be useful in determining how much progress an LPN learner makes toward determining the LPN secret. Finally, I created my introductory presentation.
Week 2:
This week began with practicing/giving my introductory presentation on the problem statement. It gave me the opportunity to really understand and make concise what the key ideas and techniques were in the prior work (i.e. branching programs and sufficiently low bias in the submatrices for the extractor representation of the learning problem).
Following that, I met with Prof. Garg twice and we further discussed why the extractor-based techniques in the previous work could not extend to the tighter lower bound. I gained a much better understanding of the knowledge surrounding parity learning in general. We will continue to model the learning problem as a branching program, as that is a standard model for memory-bounded programs, but will need to redefine what it means for an edge (which now corresponds to many samples rather than just one) or vertex to be "bad" or give too much information about the LPN secret. Now that an edge represents many samples, we have to consider patterns or groups of samples that may give information, rather than just one sample at a time. We came up with a possible definition for a generally bad edge, for which need to argue that there are negligible, and now the task for next week is to define a bad edge per vertex.
Week 3:
Now we are looking to define a bad edge given that we have landed on a particular vertex. The main task here is to determine what a reasonable bound for the ratio between distribution of the edge e given vertex v and the distribution of e in the whole branching program. I spent most of the week trying edges and belief distributions with different properties (repeated samples, r bits of x being known, etc) to try to gain some intuition about the quantity, and we continued this in our meeting on Friday.
If our belief distribution Px|v is uniform, this quantity will be 1. Then we want to compute the bias of this distribution from 1 as the belief distribution changes. I tried to estimate/get intuition for the bias as H(Px|v) varies farther from uniform. We realized that there are some patterns in x that would lead to edges with high bias with a significant likelihood, so in our meeting, we discussed placing additional restrictions on the error parameter epsilon and the edges themselves in hopes of getting around this. I also attended the CS theory reading group and listened to a paper presentation from Vihan Shah on approximation algorithms for maximum independent sets with an oracle that answers vertex membership queries.
Week 4:
After finding there are large classes of edges with relatively high average bias across LPN secret x values, we decided to attempt to bound correlations between pairs of edges on the same path in the branching program. Because this correlation between pairs of edges is a more complicated quantity, it has been difficult to get any intuition for it. This week, I have been using Python code to calculate these quantities for random pairs of edges and more structured pairs of edges to see if we can come up with tail bounds on the CDF of this correlation value.
Week 5:
The tails of the CDF plots of correlations between pairs of edges I produced ended up having tails that were far too shallow to have a meaningful or useful bound. In our meeting, we opted to instead prove a weaker tradeoff, with linear rather than exponential samples and introduced an additional parameter to hopefully make the proof easier.
Week 6:
This week we continued to work on proving the weaker memory-sample tradeoff, and tried to prove some intermediate claims. That is, we tried to prove an upper bound on the bias for matrix M(e,x) representing the learning problem on a specific edge for all but a small fraction of edges. also presented at the CS theory reading seminar this week on a topic in theoretical cryptography that interested me: fully homomorphic encryption, specifically the 2013 Gentry-Sahai-Waters construction.
Week 7:
This week, we tried to work out the polynomial tradeoff and ran into some obstacles calculating correlation between belief distributions on two edges with a close relationship. We are considering a new model, looking at a simpler problem from a communication complexity angle that would help us prove a polynomial tradeoff. Unfortunately, I got sick Wednesday evening, so I was not able to work on the project this week as much as I would have liked.
Week 8:
I presented the project on Wednesday and spent much of Monday and Tuesday preparing for this, along with some preliminary writing for the final report. It was a nice opportunity to get my thoughts in order and summarize the work we'd done this summer. It was somewhat difficult to find a way to discuss our conjectures and obstacles without much background on the techniques in 9 minutes. At the end of the week, we also realized that the polynomial tradeoff may still work, so we're revisiting these correlation quantities.
Week 9:
This week I traveled to Prague with some of the other REU participants, hosted by Charles University. During the morning/early afternoon we attended some lectures and problem solving sessions from Charles professors on the topics of spectral graph theory, discrete geometry, and perfect graphs, along with student presentations on game theory related topics and geometric graphs. I have not come into contact with combinatorics or graph theory much since high school, and ended up enjoying the talks far more than I expected as a computer science person! The problem solving sessions were fun as well, and a nice chance to discuss problems with the other REU participants, which I unfortunately was not able to do as much as I liked during the program. In the evenings, beyond exploring the city, I wrote up the final report for the project, along with spending a bit of time verifying calculations on the correlation quantities.