General Information
Project Description
Robust learning is the study of statistical techniques and machine learning algorithms that can recover properties of a distribution even in the presence of noise.
For my project I will be studying computationally efficient methods that are resistant to arbitrary adversarial noise (restricted to a fraction of the dataset), and how
current techniques can be extended to more unsupervised machine learning problems. A small example of this corruption model is sketched below.
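To make the setting concrete, here is a minimal sketch of the corruption model, with parameter choices that are purely illustrative: an adversary replaces an eps-fraction of an otherwise clean Gaussian sample, which is enough to drag the empirical mean far from the truth, while a simple robust estimate like the coordinate-wise median barely moves.

```python
# Minimal sketch of the corruption model described above (illustrative
# parameters): an adversary replaces an eps-fraction of a clean Gaussian
# sample, dragging the empirical mean away while the coordinate-wise
# median barely moves.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 1000, 10, 0.1               # sample size, dimension, corrupted fraction
X = rng.normal(0.0, 1.0, size=(n, d))   # clean sample from N(0, I)
X[: int(eps * n)] = 50.0                # adversary moves eps * n points far away

print("empirical mean error:", np.linalg.norm(X.mean(axis=0)))        # ~ eps * 50 * sqrt(d)
print("coordinate-wise median error:", np.linalg.norm(np.median(X, axis=0)))
```

Even the median's error grows with the square root of the dimension in the worst case, though; closing that gap is exactly what the dimension-independent methods from the Week 2 papers are about.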
Weekly Log
- Week 1:
- For the first week I met with my mentor Dr. Pranjal Awasthi and discussed the field, along with several potential research directions. I then read up on background material and prepared my introduction presentation.
- Week 2:
- For the second week I read two papers on robust mean estimation in the presence of adversarial noise.
The first was this paper, which introduced me to the formal setup of robust learning and gave multiple approaches for recovering the parameters of specific distributions with error independent of the dimension (a big result at the time).
The second was this paper, which built on the first and gave a much more general sufficient condition for robustly recovering distribution parameters under adversarial noise, which the authors call "resilience".
I also did some reading on convex optimization, since I was missing some of the background needed for these papers. A rough sketch of the filtering idea behind this line of work is below.
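To record the common thread while it is fresh: corruptions that shift the empirical mean must also inflate the empirical variance in some direction, and resilience, as I understand it, formalizes the complementary property that every subset containing at least a (1 - eps)-fraction of the points already has mean close to the overall mean. Below is a filtering-style estimator sketched from memory in the spirit of these papers; the stopping threshold and removal rule are my own illustrative choices, not either paper's actual pseudocode.

```python
# Filtering-style robust mean estimation, sketched from memory (thresholds
# and the removal rule are illustrative, not either paper's pseudocode).
# Idea: repeatedly find the direction of largest empirical variance and
# discard the points that project furthest along it, until the covariance
# looks consistent with the clean distribution.
import numpy as np

def filtered_mean(X, eps, variance_bound=1.0, slack=2.0, max_iter=50):
    """Estimate the mean of X when an eps-fraction is adversarial and the
    clean distribution has covariance at most variance_bound * I."""
    X = np.asarray(X, dtype=float)
    for _ in range(max_iter):
        mu = X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
        if eigvals[-1] <= slack * variance_bound:
            break                          # no direction looks suspicious
        v = eigvecs[:, -1]                 # direction of largest variance
        scores = ((X - mu) @ v) ** 2
        # Drop the eps-fraction of points that stick out most along v.
        X = X[scores <= np.quantile(scores, 1.0 - eps)]
    return mu
```

On data like the corrupted Gaussian from the project description, this recovers the mean to error that does not grow with the dimension, which is the qualitative behavior the papers prove with more careful scoring and thresholds.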
- Week 3:
- This week we discussed a potential approach for improving the analysis in one of these papers, which seems promising. I tried modifying some of the proofs in the papers to build intuition for the ideas behind them.
- Week 4:
- This week we looked at improving the runtime of one of the mean estimation methods from the earlier papers. I found an example where the algorithm described in the paper takes as long as its analysis predicts, but with slight modifications to the algorithm (and some preprocessing) it seems hopeful that we can prove faster convergence. I was able to show that the modified algorithm converges faster in certain cases (for a particular distribution of the points, and in low dimension). A hypothetical harness for this kind of experiment is below.
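Since the modified algorithm and its analysis are still being written up, I will not reproduce them here. The snippet below is only a hypothetical harness for the kind of sanity check this suggests: count how many filtering rounds a Week-2-style filter needs, with and without a cheap preprocessing pass that discards points far from the coordinate-wise median. The prune_far_points helper and its radius are my own illustrative choices, not the preprocessing from the write-up.

```python
# Hypothetical experiment harness: compare filtering rounds with and without
# a cheap preprocessing pass. filter_rounds re-implements the Week 2 sketch
# but also reports how many removal rounds it performed; all thresholds are
# illustrative.
import numpy as np

def filter_rounds(X, eps, slack=2.0, max_iter=50):
    """Week-2-style filter; returns (mean estimate, number of removal rounds)."""
    for t in range(max_iter):
        mu = X.mean(axis=0)
        eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
        if eigvals[-1] <= slack:           # covariance looks clean: stop
            return mu, t
        scores = ((X - mu) @ eigvecs[:, -1]) ** 2
        X = X[scores <= np.quantile(scores, 1.0 - eps)]
    return mu, max_iter

def prune_far_points(X, radius=10.0):
    """Preprocessing: drop points far from the coordinate-wise median."""
    med = np.median(X, axis=0)
    return X[np.linalg.norm(X - med, axis=1) <= radius]

rng = np.random.default_rng(1)
n, d, eps = 2000, 3, 0.1                   # deliberately low dimension
X = rng.normal(0.0, 1.0, size=(n, d))
X[: int(eps * n)] = rng.normal(30.0, 1.0, size=(int(eps * n), d))

for name, data in [("raw", X), ("pruned", prune_far_points(X))]:
    mu, rounds = filter_rounds(data, eps)
    print(f"{name}: rounds={rounds}, error={np.linalg.norm(mu):.3f}")
```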
- Week 5:
- After discussing with Dr. Awasthi I was able to improve my proof: the slightly modified algorithm from before converges quickly (still only in low dimensions), at the cost of a slightly worse error bound on the distance to the true mean. I wrote up the proof from last time for Dr. Awasthi and have been exploring how to extend the argument to higher dimensions.
Presentations
Additional Information