Jasmine Khalil DIMACS REU 2024

DIMACS REU 2024

General Information

Student:	Jasmine Khalil
Office:	440 CoRE Building
University:	Penn State University
E-mail:	jkk5987@psu.edu
Mentor:	Dr. Pierre Bellec

Project Description

Optimization, learning and high-dimensional macroscopic limits:
The last decade has seen the emergence of new phenomena where complex statistical learning problems such as high-dimensional regression and classification can be accurately summarized by simple systems of equations. These simple systems of equations characterize the high-dimensional limit of the statistical learning problem at hand and provide new insights on regularization and the choice of statistical estimators in high-dimension. The project will explore problems in this line of though, requiring and developing skills in probability, statistical/machine learning, numerical programming and computational linear algebra (From DIMACS Website).

Weekly Log

Week 1:: This week, I dedicated most of my time to understanding my project. I met with my mentor, Dr. Bellec, on Thursday, and we discussed the different directions in which we could focus my REU summer project. We ultimately decided that a project relating to the Coupling from the Past method for perfect sampling from the stationary distribution of a Markov chain would be the most suitable and interesting topic.; I delved into online lectures and textbook chapters on Markov chain Monte Carlo algorithms to familiarize myself with the topic. To solidify my understanding, I developed Python simulations to demonstrate random walks around different Markov Chains, and using the Monte Carlo method, I was able to identify the stationary distributions.; Once I had a good understanding of MCMC and stochastic sampling, I started looking into the Ising model for ferromagnets to get a firmer grasp of the physics model. I read these lecture note summaries detailing Coupling from the Past with the Ising Model and this chapter from the textbook Finite Markov Chains and Algorithmic Applications by Olle Häggström.
Week 2:: At the start of this week, I presented my project to the DIMACS REU group which helped me identify my goals for the summer and what steps I need to take to achieve such goals.; I dedicated much of my time this week to gaining a better understanding of the Ising model by deriving and familiarizing myself with the Partition function and the Hamiltonian of the 1D model. I have gained a much stronger understanding of how the model is used to study the phase transition of the spin directions in ferromagnets that interact with their nearest neighbors and an external magnetic field.; I worked on deriving the two sums of the Hamiltonian of the Ising model, the first one which sums over all nearest neighbor interactions and the other comprising of the spins’ interactions with the external magnetic field. I also now better understand the phase transition concept inside the spin model due to temperature, where at the curie temperature, the internal energy increases, and the order of the spins is lost. I also worked on the simplification of the partition function using the transfer matrix method with the help of this text and this lecture summary
Week 3:: This week, I worked on building Python scripts to simulate simple examples of Coupling from the Past. I met with my mentor, Dr. Bellec, at the start of the week and discussed possible ways to implement CFTP in code. I started by creating an algorithm simulating random walks each starting at one of four states. In this initial simulation, each chain started at 0 and ended after completing 500 steps of updating its state by randomly increasing or decreasing to the state above or below itself after which all the states eventually coalesced.; I then dedicated the rest of my time to simulating the CFTP algorithm for 40 states by starting all of the chains in the past as far as necessary such that they coalesce by 0. This took some time to implement correctly and debug to achieve the desired outcome but I eventually got it. At the end of the week, I met with Dr. Bellec again and discussed my code as well as improvements to ensure my code runs accurately. We then discussed goals for next week including working on generating graphs simulating sandwiching for Markov Chains with the property of monotonicity and extending this idea to the Ising model.
Week 4:: I started this week by working on simulating Coupling from the Past using sandwiching on the same 40-state Markov Chain with monotonicity and worked with Dr. Bellec on recreating Figure 25.2 in this text. I then focused my time on simulating the Ising model for a 3x3 lattice in Python using two different methods. The first method used the update function in section 3.1.4 of these lecture note and the other in this chapter on sandwiching of this text. Both models choose a random vertex in the lattice and update the vertex to spin up/down depending on different update functions. The first method predominantly uses the sums of the nearest neighbors spin ups and downs to update while the second method uses the Hamiltonian of the model before the update and after and compares the two to decide if the update should take place. I am still currently working on debugging the second model as well as optimizing the first.
Week 5:: At the start of the week, I focused on debugging the second simulation to produce the correct results and optimizing the first model to produce larger grid simulations faster. I then worked on extending the code to also calculate the magnetization of the equilibrium grid depending on its nearest neighbor energy as well as energy due to an external magnetic field. I noticed that the second model took a significantly longer amount of time to simulate a larger grid (10X10), which was due to the convolution I used to calculate the Hamiltonian. We decided that using a convolution over the whole grid was not optimal, and a different approach of only computing a sum of spins over the neighbors of the vertex at (x,y) would result in faster computation. I am currently working on correctly implementing this method for computing the nearest neighbor energy.
Week 6:: I started this week by working on correctly implementing the simpler and faster function to calculate the nearest neighbor energy by significantly simplifying the update condition. To further optimize the code, we used Numba to compile the functions that took the longest processing time, which resulted in some errors because Numba does not work well with some standard Python objects. After debugging, we got the code to produce accurate results relatively quickly, so we began working on plotting our results. Dr. Bellec introduced me to Plotly to graph our results of a 40x40 grid of spins at 26 different temperatures ranging from very low temperatures to the Curie temperatures where magnetization is lost. I then worked on a plot to visualize the phase transition by plotting the average magnetization over 5 steps against the temperature where there is a quick rise and fall to an average of 0 magnetization after the Curie Temperature is reached. I will continue working on this over the next week as well.
Week 7:: We worked on plotting a graph of average magnetization versus inverse temperature/ critical temperature to represent the phase transition of a theoretical ferromagnet Ising model using these notes.In the notes, the author derives an iteration scheme that can be implemented to get the theoretical average magnetization at different values of inverse temperatures. I worked on coding this iteration scheme and plotted the result next to the result from last week’s work on the average magnetization over 5 steps using coupling from the past to represent the phase transition of a practical ferromagnet Ising model. The resulting two plots shared similar characteristics of spontaneous magnetization of a ferromagnetic material just before the critical temperature, and a sudden drop in magnetization to 0 past the critical temperature, as expected. We then used Pandas to build a data frame of magnetizations of the material at different inverse temperatures, repeating this five times. We used Seaborn to plot the magnetization results with an error bar to visualize the magnetization fluctuation at a specific inverse temperature each time the CFTP algorithm is called.
Week 8:: I focused my time this week on creating my slides for our final presentations on Wednesday and Thursday by summarizing the most important things I learned and the most significant results we produced. I also started working on my final paper and plan to continue working on it next week as well.
Week 9:: This was my final week so I spent my time finishing up and ploishing my final paper for my project. Reflecting back, this program was a wonderful experience and I learned so much and met so many amazing individuals along the way. I want to thank Dr. Pierre C. Bellec for his mentorship, and Dr. Lazaros Gallos and Omar A. Garcia for running and managing the REU and ensuring a great experience for everyone!

Presentations and Paper

First Presentation

Final Presentation

Final Paper

Acknowledgements

Thank you to my mentor Dr. Pierre C. Bellec for his mentorship! This work was carried out as a part of the 2024 DIMACS REU program at Rutgers University, supported by NSF grant CNS-2150186.