Sumi's REU Page

Project #1: Intelligent sampling for physics-informed multifidelity learning

Mentor: Professor Rajiv Malhotra

Abstract: Our group is working on tackling the curse of dimensionality in machine learning of practical engineering systems. We have found that biasing machine learning models with surprisingly simple first-principles models can drastically reduce the experimental and computational cost of data generation for machine learning, in some cases by an order of magnitude. In this context, this project will explore incremental sampling techniques that are crucial to the success of our approach. The REU student will work with an existing graduate student and implement different incremental sampling methods in the context of transfer learning, with very high potential for a journal publication. Knowledge of Python programming in the context of machine learning methods such as SVRs, random forests, and feedforward neural networks is necessary.

Weekly Logs

Week 1
May 31-June 2
The first week was primarily orientation. I settled into my apartment, attended the orientation lecture on Wednesday morning, and began reading through some literature related to my project, which is about physics-informed multifidelity learning. Since the project focuses on incremental sampling techniques, I also read about asynchronous Bayesian optimization, an optimization technique that lets us evaluate multiple models simultaneously and intelligently select the next set of solutions to evaluate. This is useful in physics-informed machine learning (PIML) because model evaluation is often very computationally expensive. I also made an introductory presentation on my topic, which I will present on Monday.
Week 2
June 5-June 9
This week moved a little slowly for me. I met with my mentor for the first time on Monday, and he sent me some data to train a neural network on. A glance through the data shows that it has to do with printing: we are given the nozzle-to-plate distance, extruder speed, filament feed rate, and the resulting average line width, and we want to train a model to predict the average line width from the other three quantities. I trained a neural network and did some visualizations of the data. I also explored some Python libraries for hyperparameter tuning in order to improve the model even more. I continued reading about PIML and got a better grasp of multifidelity learning, as well as specific algorithms for asynchronous batch Bayesian sampling (Thompson sampling and fantasizing).
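To make this concrete, here is a minimal sketch of the kind of model I trained. The file name and column names below are hypothetical stand-ins for the dataset my mentor sent, and the network size is just an illustrative choice, not the one I ended up using.

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.neural_network import MLPRegressor
    from sklearn.metrics import r2_score

    # Hypothetical file and column names standing in for the printer dataset.
    df = pd.read_csv("printer_data.csv")
    X = df[["nozzle_to_plate_distance", "extruder_speed", "filament_feed_rate"]]
    y = df["avg_line_width"]

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    # Scale the inputs, then fit a small feedforward network to predict line width.
    scaler = StandardScaler().fit(X_train)
    nn = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
    nn.fit(scaler.transform(X_train), y_train)
    print("test R^2:", r2_score(y_test, nn.predict(scaler.transform(X_test))))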
Week 3
June 12-June 16
My mentor had a busy schedule and wasn't able to meet this week, so I worked on the printer dataset some more. I fit an SVR and learned how to do hyperparameter tuning for the SVR as well. This was the first time I had implemented an SVR model, so it was fun to see it in motion. I also ran simple linear regression and experimented with regularization terms, but the SVR performed the best out of all of the models. I also attended several talks from the Modern Techniques in Graph Algorithms workshop. I understood very little of the content, but I am interested in graph theory and have done an extensive project on graph neural networks, so it was interesting to see more advanced techniques. I plan to rewatch a few of the lectures and see if I can understand more of them next week.
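The SVR hyperparameter search looked roughly like the sketch below, reusing the train split from the Week 2 sketch. The grid values are illustrative rather than the ones I actually settled on.

    from sklearn.svm import SVR
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import GridSearchCV

    # Put the scaler inside a pipeline so it is refit within each CV fold.
    pipe = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
    param_grid = {
        "svr__C": [0.1, 1, 10, 100],
        "svr__gamma": ["scale", 0.01, 0.1, 1],
        "svr__epsilon": [0.01, 0.1, 0.5],
    }
    search = GridSearchCV(pipe, param_grid, cv=5, scoring="r2")
    search.fit(X_train, y_train)   # same split as in the Week 2 sketch
    print(search.best_params_, search.best_score_)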
Week 4
June 19-June 23
This week was a little hectic, but I am ultimately very happy with how it turned out. On Monday, I met with my mentor, and he suggested that I try symbolic regression for the printer dataset. I had never heard of symbolic regression, so I read a few articles and watched some YouTube videos. Explainable AI is one of my interests, so it was cool to see how it can be applied to regression tasks. Ultimately, however, I opted to change course and move to a new project focused on logistic regression in high dimensions. I met with my new mentor on Thursday, and I am optimistic about the new research that I will be doing in this area. On Friday, I started reading the paper that he had assigned me, with the goal of recreating one of the figures from it. I also attended the Data Science Bootcamp lectures on Tuesday and Wednesday. While they were interesting, I decided to direct more time to reading about logistic regression, especially because my previous coursework in data science had already introduced me to many of the concepts discussed in the bootcamp.

Readings

Project #2: Optimization, learning and high-dimensional macroscopic limits

Mentor: Professor Pierre Bellec

Abstract: The last decade has seen the emergence of new phenomena where complex statistical learning problems such as high-dimensional regression and classification can be accurately summarized by simple systems of equations. These simple systems of equations characterize the high-dimensional limit of the statistical learning problem at hand and provide new insights on regularization and the choice of statistical estimators in high dimensions. The project will explore problems in this line of thought, requiring and developing skills in probability, statistical/machine learning, numerical programming, and computational linear algebra.

Weekly Logs

Week 5
June 26-June 30
This week, I became better acquainted with my new project. We want to define a curve for the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression, parametrized by the ratio $\kappa = p/n$, where $p$ is the number of features and $n$ is the number of observations, and by $\gamma = \mathrm{Var}(\textbf{x}_i^T\beta)$, where $\textbf{x}_i$ is the $i$-th row of the feature matrix $X$ and $\beta$ is the vector of logistic regression coefficients. The curve is already known for logistic regression with two classes, and we want to find a similar curve for multinomial logistic regression. This week, I worked on recreating Figure 2a from this paper, which shows both empirical data and the theoretical curve of the phase transition for the existence of the MLE in binomial logistic regression. After reading a few sections of textbooks on logistic regression, I was able to generate an approximate figure, but it still looks wonky. I will work on it more next week.
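To pin down what one cell of the figure involves, here is a rough sketch of a single Monte Carlo trial, assuming standard Gaussian features and labels in {-1, +1}. The constants and scaling are placeholders rather than the paper's exact conventions, and the separability check here is the linear-programming version.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)

    n, kappa, gamma = 1000, 0.2, 5.0
    p = int(kappa * n)

    # Simulate a logistic model with Var(x_i^T beta) = gamma, as defined above.
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[0] = np.sqrt(gamma)
    probs = 1.0 / (1.0 + np.exp(-X @ beta))
    y = np.where(rng.random(n) < probs, 1.0, -1.0)

    # The data are strictly separable iff some b satisfies y_i * x_i^T b >= 1
    # for all i, a linear feasibility problem; if it is feasible, the MLE does
    # not exist. (Ties, i.e. quasi-complete separation, have probability zero
    # under this Gaussian design.)
    res = linprog(c=np.zeros(p), A_ub=-(y[:, None] * X), b_ub=-np.ones(n),
                  bounds=[(None, None)] * p, method="highs")
    mle_exists = res.status != 0   # status 0 means a separating b was found
    print("MLE exists:", mle_exists)

Repeating this over many trials for each $(\kappa, \gamma)$ pair gives the empirical side of the phase transition plot.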
Week 6
July 3-July 7
I continued working on recreating Figure 2a from this paper, and I was finally able to recreate the image after making some tweaks to how I evaluate the existence of the MLE. The original paper uses linear programming to check whether maximizing $\sum_{i=1}^n y_i(\textbf{x}_i^T\beta)$ has a solution for the given parameters, which determines whether the MLE exists. However, we can also just test whether $y_i(\textbf{x}_i^T\beta) \geq 0$ for all $i$. This makes the code run faster, and it gives the correct result. I was also able to recreate the theoretical curve defined in the paper and show that it matches the empirical results.
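In code, the shortcut amounts to replacing the linear program from the Week 5 sketch with a sign check against the $\beta$ used to simulate the data (same X, y, and beta as in that sketch):

    import numpy as np

    def mle_does_not_exist(X, y, beta):
        # If y_i * x_i^T beta >= 0 for every observation, the simulated data are
        # already separated by beta itself, so the MLE cannot exist.
        return bool(np.all(y * (X @ beta) >= 0))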
Week 7
July 10-July 14
This week, I shifted focus to evaluating the existence of the MLE for multinomial logistic regression. I started the week by reading this paper, which discusses the existence of the MLE for Gaussian mixture models, and I also read some portions of a statistical learning textbook to learn the probability distributions for multi-class logistic regression. Following my recreation of the binomial logistic regression phase transition, I tried to show an empirical phase transition for logistic regression with three classes. I once again ran into an issue with testing whether the MLE exists, as my previous method of testing whether a separating hyperplane exists only works with two classes. Instead, my mentor advised me to compare the predicted values and the actual values of the simulated data and test whether they are exactly the same. If they are, then the classes are linearly separable and the MLE does not exist. For logistic regression with 3, 4, and 5 classes, I noticed that the phase transition did exist and followed a similar curve to the two-class case, but the curve seems to shift to the left. Next week, I will try to use the proof of the theoretical phase transition for binomial logistic regression to explain these results.
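As a sketch (my own reading of the advice, not my mentor's exact code), the multi-class check can be written with scikit-learn by making the regularization negligible and testing whether the fitted model reproduces every training label:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def mle_does_not_exist_multiclass(X, y):
        # Effectively unregularized fit (C very large); with more than two
        # classes the default lbfgs solver fits a multinomial model.
        clf = LogisticRegression(C=1e10, max_iter=10000)
        clf.fit(X, y)
        # If the predictions match the simulated labels exactly, the classes
        # are linearly separable and the MLE does not exist.
        return bool(np.all(clf.predict(X) == y))

Here y holds integer class labels 0, ..., K-1 for K = 3, 4, or 5 classes.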
Week 8
July 17-July 21
This was the last week of the REU before a group of us left for a combinatorics workshop in Prague. I spent the beginning of the week finalizing my plots and preparing my presentation, which I gave on Thursday, July 20, and the latter half of the week was spent packing and briefly reviewing my combinatorics problem sets from last year in order to refresh my memory.
Week 9
July 24-July 28
This was the first week in Prague! In the mornings, we had introductory lectures on probabilistic graph theory, combinatorial geometry, visibility problems, and algorithmic game theory. On Thursday, we had student presentations, and I gave an expository talk on graph neural networks, as I had done a project on them this past semester. In our free time, we traveled around Prague and worked on our final papers.

Readings