2021 DIMACS REU Projects

DIMACS is very proud for the achievements of the 2021 DIMACS REU students during their summer research experience. This page includes the abstracts of the final papers that the students submitted to the program, summarizing their summer research.

Anna Antal (Rutgers University-New Brunswick NJ) & Sarah Pritchard (Georgia Institute of Technology-Main Campus GA)
A note on the involutive concordance invariants of certain (1,1)-knots
Mentor: Kristen Hendricks, Mathematics

Abstract: In this project we worked with involutive Heegaard Floer homology, developed by Hendricks and Manolescu as a refinement to Heegaard Floer homology. Involutive Heegaard Floer homology introduced two new knot concordance invariants. These invariants are interesting, because unlike other concordance invariants like τ, ε, and ν, they do not necessarily vanish on knots of finite concordance order, like the figure-eight knot. We computed these two new invariants for all non-thin and non-L-space 10 and 11-crossing (1,1)-knots for which they were not yet known. We also studied (1,1)-diagrams, and the relationships between them and the resulting CFK^∞ chain complexes.

Sivasomasundari Arunarasu (Emory University GA)
Data-Guided Modeling of Cancer and Intratumor Heterogeneity
Mentor: Subhajyoti De, Rutgers Cancer Institute of New Jersey

Abstract: Although all cells in a tumor come from one original mutated cell, there is genetic, nongenetic, and microenvironmental variation among these cells, referred to as intratumor heterogeneity. Nongenetic heterogeneity, which includes epigenetic changes, is closely related to the cancer stem cell (CSC) model, which states that certain tumor cells are characteristic of regular stem cells, and are likely to contribute to tumor survival and regrowth after treatment. Being able to identify phenotypic features that can classify tumor cells into a stemlike and differentiated state can help to better understand the CSC model, and the characteristics of cancer stem cells. Using the data of cells grown from the T24 cell line, PCA (principal component analysis) and UMAP analysis in R was used to group tumor cells into two different classes, which are thought to correspond with more holoclone (stemlike) and more differentiated cells. This classification can be used to analyze the colony makeup of other tumors, and might help determine the likelihood that a tumor will regenerate after treatment, based on the proportion or density of holoclone and meroclone-like cells.

Kenneth Blakey (Brown University RI)
On the compactification of bounded edge flow trees
Mentor: Christopher Woodward, Mathematics

Abstract: Let L⊂J^1(M) be a closed Legendrian submanifold of the 1-jet space of a closed Riemannian manifold. We conjecture that the moduli space of flow trees with at most n edges has a natural compactification given by the moduli space of broken flow trees with at most n edges. In order to solve this one would have to compactify the various moduli spaces of gradient trajectories of local function differences determined by L. We establish a compactification of the moduli space of Morse trajectories - gradient trajectories connecting critical points of a fixed local function difference - similar to the compactification in standard Morse theory.

Hallie Byatt (Georgia Institute of Technology-Main Campus GA)
The Quantum Mechanics of an Electron Surrounded by Two Photons
Mentor: Shadi Tahvildar-Zadeh, Mathematics

Abstract: I formulated a relativistic Lorentz-covariant system of wave equations for a three-body, three-time one-dimensional system consisting of two photons and one electron. These particles interact only upon contact, where a boundary condition is applied to preserve the conservation of probability current. The solutions for various configurations of the initial-boundary value problem were found, and hopefully will pave the way to finding the solutions for a system of one electron surrounded by N photons.

Jiajun Chen (Stony Brook University NY) & Moriah Walker (Rutgers University-New Brunswick NJ)
Graph Cities Experimentation
Mentor: James Abello, Computer Science

Abstract: The applications Graph Cities and Graph Strata are able to decompose graphs of over 1 billion edges into smaller subgraphs that can fit on a screen and be interactively explored without losing the context of the original graph. Observing these local subgraphs reveals a variety of patterns. This report serves to find a way to describe these patterns as well as unpack meaning from these patterns by generating data stories and detecting semantically important vertices through the vertex statistics, activity and diversity.

Yun Huen Cheng (Mount Holyoke College MA)
Data Analysis on COVID-19 patterns
Mentor: Lazaros Gallos, DIMACS

Abstract: The COVID-19 pandemic has a global impact due to its high infectivity and due to human travel behaviors. Focusing on geographic patterns to analyze the evolution of the pandemic, we can determine which areas are at risk of an outbreak. By analyzing the spatial correlations of new active cases in the US at the county level we can show the extent of these correlations at different times. Our results show that the epidemic was largely not uniform but formed clusters. We found that in the first few months long-range spreading was mainly located in cities rather than rural areas. In addition, there exists a percolation transition in November 2020, and a smaller transition in January 2021, both corresponding to peaks of the epidemic.

Emily Chin (Harvey Mudd College CA)
Machine Learning for SAT-solving heuristics
Mentor: Periklis Papakonstantinou, Management Science and Information Systems

Abstract: 3-SAT is a known NP-Hard problem in theoretical computer science. In other words, it takes exponential time to deterministically solve this problem. However, there is a fairly effective random algorithm that finds the solution quickly most of the time, if there is one. This "Random Walk Algorithm" finds the solution by acting randomly at each timestep, but the results may not be completely random. This leads us to explore what features about an instance of the problem are important and how we can use these features to predict if a solution can be found at the end of a certain number of iterations of the algorithm. We used various machine learning techniques to make predictions using features that we found to be significant.

Anna Cusenza (University of California-Los Angeles CA)
Data-Driven Dynamics
Mentor: Konstantin Mischaikow, Mathematics

Abstract: Our goal is to interpret the dynamics of multi-parameter nonlinear dynamical systems. Such systems can have very complicated behavior. Because of this and the fact that in applications data collected may not accurately represent the generating system, it is desired to gather from the data robust patterns which are consistent with respect to perturbations in parameter values. A method to analyze data generated by such systems, as introduced in previous work, is to discretize the dynamics via a decomposition of the compact phase space into square grids. With a grid decomposition of the phase space, a directed graph which approximates the behavior of the system for a fixed set of parameter values is created. Each node in the directed graph represents a cubical cell in the phase space, and the directed edges denote where each cell maps to under the system. With graph algorithms we can find regions which contain recurrent dynamics, and using Conley index theory we can further identify special properties of the system. In this project we use the method of analysis described above, with the change being that we decompose the phase space using Voronoi cells rather than cubical cells. A Voronoi cell complex is generated by a finite set $P = \{x_1,x_2,\dots, x_n\}$, where each $x_i$ is chosen in the phase space. Each Voronoi cell $V_i$ is defined by $V_i = \{x|d(x,x_i)\leq d(x,x_j) \text{ for all } j\neq i,\ 1\leq j\leq n\}$. Our goal is to determine whether using Voronoi cells to decompose the phase space yields comparable results to the results generated by the cubical complex; in visual quality and accuracy. The goal is to apply these methods to real data collected from systems generated by weather patterns and robotics.

Cheng-Hao Fu (University of Michigan-Ann Arbor MI)
Explorations in Tensor Phase Retrieval
Mentor: Anand Sarwate, Electrical and Computer Engineering

Abstract: This paper outlines a project that started in the summer of 2021 with the goal of coming up with a better method than existing ones to solve the Low Rank Phase Retrieval Problem, primarily in terms of sample complexity, but also potentially in runtime. It remains a work in progress.

Jacob Gorenburg (Haverford College PA)
Combinatorial Options Markets
Mentor: David Pennock, DIMACS

Abstract: This paper is an extension of the work done by Xintong Wang, David Pennock, and others on combinatorial options markets. Their original paper focused on the design and time complexity of an exchange that accepts call and put options on multiple stocks in a single bundle. These options give traders greater flexibility and precision in their trading. The original work focused on determining the optimal match in single-instance auctions. We extend this in two ways: providing a system for transferring the surplus from these trades back to the traders and designing a continuous combinatorial auction. One of the key principles of modern exchanges is that their profits come from a small fee on every trade, not from any surplus that occurs during the course of trading. If one trader is willing to buy a stock at a given strike price for $10 and another is happy to sell for $7, the $3 surplus is given to whichever trader placed the second bid. In a normal market, the surplus is always a flat amount since identical options are being bought and sold. However, in a combinatorial market, the surplus potentially also contains a variable surplus that depends on the relative successes of the various options. In this paper, we propose a method of transferring the surplus in a combinatorial market from the exchange to a trader by providing them with a bundle of options. In addition to dealing with the surplus, this paper extends the idea of a combinatorial market to the continuous-style auctions found in real-world exchanges. Wang's work focuses on a single-instance auction where all offers are collected and a single optimal match is returned. In practice, most markets are running continuously and decide on trades when new bids are placed. This style of auction allows for faster trading and benefits quick traders over those with better offers. In this paper we introduce a mechanism for a continuous combinatorial market and make some progress on an algorithm for running one.

Tavis Johnson (Iona College NY) & Delta Lyczak (Lafayette College PA)
Programmable Routers: RMT and P4
Mentor: Srinivas Narayana, Computer Science

Abstract: Internet routers deal with significant amounts of internet traffic in everyday use. This increasing amount of traffic over the internet affects the performance, and speeds, at which these routers operate. The goal of our research project is to understand how routers manage processing packets at such high speeds while remaining flexible and reconfigurable. Through this project, we have come to understanding layers of networking-abstraction, key networking concepts, pipelined systems, along with the history of routing software, hardware, and information on completed P4 programming exercises. With this knowledge, we hope to be able to design improvements for the performance of routers and the management of campus network resources.

Sangjun Ko (Rutgers University-New Brunswick NJ)
Morse Flow Trees and Chain Complexes
Mentor: Christopher Woodward, Mathematics

Abstract: Morse flow trees are a tool to compute Legendrian contact homology, and Ekholm has showed that these flow trees have a correspondence with certain types of rigid pseudoholomorphic disks in T*M if M is a manifold. There are still some open questions regarding the space of these flow trees, some of which we investigated this summer. Our aim here will be to describe some of the background material required to understand the definition of Morse flow trees. First we discuss the elements of Morse Theory and define the Morse complex; then we discuss a little bit of symplectic topology so that we define what a Lagrangian submanifold is; finally, we give the definition of a flow tree and introduce a problem related to it.

Erin McGowan (Rutgers University-New Brunswick NJ)
Comparing Physics-Informed Loss Functions for Porosity Prediction in Laser Metal Deposition
Mentor: Weihong 'Grace' Guo, Industrial and Systems Engineering

Abstract: Laser metal deposition (LMD) is a type of additive manufacturing (AM) during which metal components are created using a laser beam that fuses metal powder by melting it as it is deposited. Porosity, or small cavities that form in this printed structure, is generally considered to be one of the most destructive defects that can occur in metal AM. Currently, porosity can be measured after printing with CT scans. While this is useful for understanding the nature of pore formation and its characteristics, purely physics-driven models lack real-time prediction ability. Meanwhile, a purely deep learning approach to porosity prediction leaves valuable physics knowledge behind. Here we create a hybrid model that takes advantage of both empirical and simulated LMD data to show how various physics-informed loss functions impact the accuracy, precision, and recall of a baseline deep learning model for porosity prediction. In particular, we find that some versions of the physics-informed model are able to improve upon the precision of the baseline deep learning-only model (albeit at the expense of overall accuracy). This work lends itself to a wide array of applications as LMD is frequently used in the automotive, aerospace, energy, petrochemicals, and medical industries.

Soham Palande (Rutgers University-New Brunswick NJ)
Co-evolution of Opinions and Signed Network dynamics in Real World Scenarios
Mentor: Jie Gao, Computer Science

Abstract: With the pervasion of social media in modern society, evolution of opinions and network dynamics in social networks have gained special interest from industry, academia, and governments. The study of opinion and network dynamics has several applications- social sciences, behavioral economics, game theory- with consequential implications. Adversarial attacks through fake news and troll bots are increasingly common and there is a need for a rigorous understanding of their impact on individuals and the social networks. We study opinion and network dynamics and characterize social networks at their limit states extending recent work in co- evolution of signed networks and opinions to better represent the real world. We build upon intuition of the real world, in which individuals hold multiple opinions on a range of issues for example- healthcare, gun control and seek to better represent complex relationships/social ties between individuals in a network. We formulate these intuitions mathematically and incorporate them in a co-evolution model which combines signed network dynamics based on structural theory and opinion dynamics. We conduct a comprehensive study through numerous simulations and validate the model on the benchmark Zachary's Karate Club dataset to achieve 100% accuracy. We mathematically address divergence of social ties to infinity, which in the real world would translate to limitless polarization, by introducing normalization which keeps the relative magnitude of the social ties intact but prevents divergence.

Aaditya Raghavan (Georgia Institute of Technology-Main Campus GA)
On Spanning Tree Counts for Bipartite Graphs
Mentor: Bhargav Narayanan, Mathematics

Abstract: Ehrenborg conjectured that, for a simple bipartite graph G, the number of spanning trees of G, is bounded above. The bound is known to hold with equality for a class of bipartite graphs known as Ferrers graphs, and not necessarily with equality in other, specific classes of bipartite graphs. In this paper, we introduce another inequality motivated by the Cartesian product of bipartite graphs - in particular, an inequality that relates the eigenvalues of the Laplacians of the multiplicand graphs, with products of the vertex degrees and the part sizes of the factor graphs. Furthermore, we introduce an inductive framework through which the conjecture can be approached by means of a determinantal inequality, along with some preliminary results that can be proved using such a technique.

David Ryzak (Charles University (Prague, Czech Republic)) & David Sychrovsky (Charles University (Prague, Czech Republic))
Unique Solution to the House Assignment Problem for N = 4
Mentor: Ron Holzman, Mathematics, Princeton University

Abstract: We study a problem from mechanism design where we try to find a mechanism which assigns N houses to N players based on their preferences with desirable properties. These aim to unsure fairness and efficiency while limiting exploitation by the players and give incentives to tell the truth. We aim to show that the Random serial dictatorship (RSD) is the only mechanism with these prescribed properties. It has been shown before that RSD in unique for N=3. We provide a computer aided proof of uniqueness for N=4 as well as rigorous for selected preference profiles for arbitrary N.

Yousef Sayes (New Jersey Institute of Technology NJ)
Identifying Bifurcations for Combinatorial Dynamical Systems
Mentor: Konstantin Mischaikow, Mathematics

Abstract: The identification of higher order bifurcations in combinatorial dynamical systems allows for deep insights into the underlying structures of these systems. The modeling of such phenomena, specifically biological ones, in a combinatorial fashion has been explored; however, a systematized process to which higher order bifurcations may be found through is still nonexistent. The use of multiple paths between given adjacent parameter nodes and applying slow-fast dynamics to represent transitions between these nodes point us toward the discovery of such bifurcations. A process to identify bifurcations therefore allows us far greater insight into these systems, especially as their complexity increases.

Fiona Shafer (Rutgers University-New Brunswick NJ)
Health Disparities & COVID-19
Mentor: Christie Nelson, Masters of Business and Science in Analytics and CCICADA

Abstract: As the COVID-19 pandemic progresses across the nation and the world, data suggests the virus disproportionately impacts marginalized communities. Specifically, disparities exist in infection rates and disease outcomes by race and ethnicity in the United States. However, it is unknown to what degree these differences have impacted specific communities in the past year. During the course of this research project, I intend to research various avenues related to COVID-19 including but not limited to long term care facilities and education systems.

Noah Singer ( Harvard University MA) & Vishal Ramesh (Charles University (Prague, Czech Republic)) & Sasha Sami (Charles University (Prague, Czech Republic))
Simple reductions to circuit minimization
Mentor: Eric Allender, Computer Science

Abstract: The complexity of the minimum circuit size problem (MCSP) - and its many variants, including the minimum Kolgomorov time-complexity problem (MKTP) - is linked intricately to countless other questions in theoretical computer science. For instance, NP-hardness of MCSP or MKTP is known to imply ZPP ≠ EXP, and even if such reductions exist, they cannot be nearly as simple as the standard NP-completeness reductions for other problems. In this project, we study whether various variants of circuit minimization can be complete for NP, or smaller classes, under simple types of reductions. Specifically, we investigate the following questions. Can recent hardness results for MKTP be extended to MCSP? How important is adaptivity in NP-hardness reductions for circuit minimization? How robust is the recent definition of non-interactive statistical zero-knowledge with logspace-bounded verifiers and simulators?

Glenn Sun (University of California-Los Angeles CA)
Lower Bounds for Deterministic Graph Coloring in the Streaming Model
Mentor: Sepehr Assadi, Computer Science

Abstract: In the graph semi-streaming model of computation, the input graph arrives as a stream and the goal is to compute solutions in O(n polylog n) space. With randomization, Assadi, Chen, and Khanna recently showed that the coloring problem for graphs of maximum degree Δ admits a solution in this model using an optimal Δ + 1 colors. However, little is known about this problem in deterministic settings. In this project, we showed that when Δ = Ω(n^{2/3}), no deterministic streaming algorithm can output proper colorings using O(n^{1-ε}) colors, for all ε > 0. In particular, any general algorithm for deterministic graph coloring must use at least Ω(Δ^{3/2 - ε}) colors.

Emily Thompson (Southwestern University TX)
Predicting Dissolution Rates of Volcanic Glass Using Graph Neural Networks
Mentor: Shashanka Ubaru, IBM Watson Research Center

Abstract: Graph neural networks (GNNs) in recent years have gained popularity due to the increasing need for machine learning models to accommodate graph structured data. Interdisciplinary fields such as material informatics use graph neural networks to conduct research on materials that are otherwise difficult to analyse. One of these materials, volcanic glass, is used in storing nuclear waste due to their durability in extreme conditions. This durability, measured by its dissolution rate, is difficult to determine in a traditional lab environment due to the extensive time and resources needed to collect and analyse samples. I seek to implement a GNN model that performs regression in order to predict the dissolution rate of ten different types of volcanic glass. In addition to implementing the GNN, I explore varying methods of optimizing the performance of the model. Results demonstrate that unknown glass materials and their accompanying dissolution rate can accurately be determined the GNN in a short period of time. The long term goal for this research is to be able to discover new glass materials based on their dissolution rates.

Hwai-Liang Tung (Brown University RI)
Bayesian inference of the origin of an infection process over a network
Mentor: Min Xu, Statistics

Abstract: Many processes such as the diffusion of information or fake news over a social media network or the transmission of disease may be modeled with an infection process. We investigate the problem of finding the patient zero using noisy observations of a set of infected nodes. Given a set of detected infected nodes and the background graph, we present a MCMC algorithm to calculate the probability a node is the patient zero. We also prove that for an infinite lattice Z^d and a set of n infected nodes, for any ε-level, there exists a credible set C_ε such that there are O(√ n log^(2d)n) nodes in C_ε.

Zoe Wefers (McGill University)
Tetramer-Dependent Elastic Energy Model for DNA Conformation
Mentor: Wilma Olson, Chemistry and Chemical Biology

Abstract: DNA folding plays an important role in gene regulation and genome packaging. The ability for a piece of DNA to bend is largely impacted by the elastic energy stored between base pairs in a segment of DNA. There are well-established techniques to mathematically model the elastic energy of a collection of DNA base pairs. It has long been known that the intrinsic elasticity parameters of DNA are sequence-dependent, but past work only models DNA folding using elasticity parameters that are specific to dimers, a sequence of two base pairs. Recent studies have suggested the intrinsic elasticity parameters of DNA are actually unique to tetramers, a sequence of four base pairs. We present software that can simulate DNA folding by optimizing elastic energy using tetrameric steps. With this new software, we optimized the configuration of DNA minicircles and confirmed that increasing the scope of sequence dependence from dimers to tetramers has an effect on DNA conformation.

Eric Xue (Yale University CT)
Toward a fair and strategyproof tournament rule for selfish agents
Mentor: Ariel Schvartzman, DIMACS & David Pennock, DIMACS

Abstract: A tournament on n agents is a binary relation that describes the outcomes of the $\binom{n}{2}$ matches played between two distinct agents. Tournaments play an important role not only in sporting events but in any setting where any two agents are comparable and some "most qualified" agent should be selected, such as elections. The winner of a tournament is determined by a tournament rule that maps tournaments to probability distributions over the agents. We want these rules to be fair (i.e., choose a high quality agent) and robust to strategic manipulation. Prior work has shown that under minimally fair rules, manipulations between two agents can be prevented when utility is nontransferable but not when utility is completely transferable. Motivated by the fact that neither utility model reflects reality, we consider pairwise manipulations under partially transferable utility. These situations arise, for example, when agents care about winning themselves but are still willing to collude if joint gains offset perceived individual losses. We show that previously studied tournament rules require high levels of selfishness in order to prevent pairwise manipulations and that for fixed levels of selfishness, there are always two agents who stand to gain a constant positive amount of probability under these rules by manipulating in large enough tournaments. We also show that for stronger notions of fairness, non-manipulable tournament rules are closely related to tournament rules that witness decreasing gains from manipulation as the number of agents increases. A major open question that remains is whether there is a tournament rule that is fair and non-manipulable for a fixed level of selfishness that meets the lower bound we prove. We conjecture that indeed such a rule exists.

Yixuan (Connie) Zhang (Macalester College MN)
Deep Genomic Analysis of Tumor Specimen
Mentor: Hossein Khiabanian, Rutgers Cancer Institute of New Jersey

Abstract: Understanding complexity, dynamics, and stochastic patterns in genomic data - concepts native to physics and mathematics - is critical for elucidating how disease states originate and evolve. In this project, we focus on the application of the Tunable Biclustering Algorithm (TuBA) to examine genetic and clinical data of cancer patients, aiming to identify genetic markers that can provide us insight into how cancers evolve. Our ultimate goal is to develop novel statistical platforms for fast translation of genomic data into clinical practice.

Return to REU Home Page
DIMACS Home Page
DIMACS Contact Information
Page last modified on July 28, 2021.