2018 DIMACS REU Projects

DIMACS is very proud for the achievements of the 2018 DIMACS REU students during their summer research experience. This page includes the abstracts of the final papers that the students submitted to the program, summarizing their summer research.

Andrew Brettin (University of Minnesota-Twin Cities MN)
A stochastic operator-splitting method for simulating the development of intratumor heterogeneity
Mentor: Subhajyoti De, Rutgers Cancer Institute of New Jersey

Abstract: Cancer is a genetic disease which begins from a single aberrant progenitor cell, which successively divides into pairs of daughter cells with similar reproductive capacities. As the tumor grows, many distinct genetic lineages may proliferate throughout the neoplasm, resulting in significant intratumor heterogeneity. Such high levels of genetic diversity within tumors is associated with low survival rates for patients, as genetically-distinct cell variants have differential sensitivities to current therapies. Despite its importance, how intratumor heterogeneity develops is not well understood. However, mathematical models can provide insight into potential causes. This work applies a stochastic operator-splitting method developed for simulating chemical reaction-diffusion processes to model neoplastic growth. It is shown that high differential birth-rates, high diffusion rates and early onset of mutations can result in substantial levels of intratumor heterogeneity.

Dalton Burke (University of Colorado Denver/Anschutz Medical Campus CO) & Elise Catania (University of Rochester NY)
Codes for Storage with Queues for Access
Mentor: Emina Soljanin, Electrical and Computer Engineering

Abstract: With the rise of big data and machine learning, computers are used frequently to process and compute large quantities of data. An incoming job is parallelized, divided among different servers. Parallelizing introduces the issue of stragglers, the slowest workers. It is known that MDS coding schemes can be utilized so that the last servers to finish do not slow down the entire job. The known hierarchy of job scheduling policies from most efficient to least efficient is JSQ (join shortest queue), power-of-d, round robin, and random. The service time of balanced and incomplete block design (BIBD) is compared to the service time of other schemes in the hierarchy. BIBD is also tested with other models for which the hierarchy is unknown. At first, simplifying assumptions are made and then, a more realistic model of service time including both job specific randomness and server specific randomness is discussed. As a result, in certain scenarios, BIBD outperforms the other schemes in the hierarchy.

Debra Chait (Macaulay Honors College at Queens College NY) & Alisa Cui (Yale University CT) & Zachary Stier (Princeton University NJ)
A Taxonomy of Crystallographic Sphere Packings
Mentor: Alex Kontorovich, Mathematics

Abstract: The Apollonian circle packing, generated from three mutually-tangent circles in the plane, has inspired over the past half-century the study of other classes of space-filling packings, both in two and in higher dimensions. Recently, Kontorovich and Nakamura introduced the notion of crystallographic sphere packings, n-dimensional packings of spheres with symmetry groups that are isometries of Hn+1, which can be related to configurations of planes in Hn+1 for various quadratic forms in n+2 variables. When applied in conjunction with the Koebe-Andreev-Thurston Theorem, Kontorovich and Nakamura's Structure Theorem guarantees crystallographic packings to be generated from polyhedra in n=2. The Structure Theorem similarly allows us to generate packings from the reflective extended Bianchi groups in n=2 by applying Vinberg's algorithm to obtain the appropriate Coxeter diagrams. In n>2, the Structure Theorem when used with Vinberg's algorithm allows us to explore whether certain Coxeter diagrams in Hn+1 for a given quadratic form admit a packing at all. Kontorovich and Nakamura's Finiteness Theorem shows that there exist only finitely many classes of superintegral such packings, all of which exist in dimensions n<21. In this work, we systematically determine and enumerate crystallographic sphere packings arising from polyhedra on up to seven vertices, all crystallographic packings arising from Bianchi groups, and all known examples of crystallographic packings arising from higher dimensional quadratic forms.

Gabriel Eiseman (Rutgers University-New Brunswick NJ)
A Partial Solution to the Baragar-Kontorovich Conjecture
Mentor: Alex Kontorovich, Mathematics

Abstract: I investigated the minimality of a construction for the fourth tangent circle to three tangent circles (aka the Apollonian circle). Baragar and Kontorovich (my mentor) discovered a general 7 step construction and I showed that under certain assumptions about arbitrary points it is indeed minimal. I also found a 5 step construction for when the circles’ centers form a right triangle and showed it and known constructions for isosceles (5 steps) and equilateral (3 steps) triangles are minimal. I am pretty sure the assumption I made about arbitrary points, which is that only a certain subset of them actually need to be considered, is correct, but have not formalized the argument.

Christopher Espana (Rutgers University-New Brunswick NJ)
Atrial Fibrillation and Onset of Pulmonary Embolism
Mentor: Javier Cabrera, Statistics and Biostatistics

Abstract: Pulmonary embolism affects hundreds of thousands of people across the United States and can cause complications such as hemoptysis or cardiac arrest. Pulmonary embolism is characterized by an embolus becoming lodged in a pulmonary artery, where the origin of this embolus is often attributed to deep vein thrombosis. Atrial fibrillation often causes emboli to develop in the heart and propagate through the body. Through the use of the Myocardial Infarction Dara Acquisition System (MIDAS), a database of discharge data on myocardial infarction patients in New Jersey, we seek to study the relationship between atrial fibrillation and pulmonary embolism, in order to understand if there exists a relationship between the diseases.

Heman Gandhi (Rutgers University-New Brunswick NJ) & Adam Jamil (Rutgers University-New Brunswick NJ)
Expressing Grothendieck Polynomials Using Rectangles
Mentor: Anders Buch, Mathematics

Abstract: The structure of Γ, the bialgebra of stable Grothendieck polynomials, is considered and a basis is shown to contain various classes of of polynomials, particularly hook shapes, 2-row shapes, and certain 3-row shapes.

Nicholas Georgiou (University of Virginia-Main Campus VA)
An Adaptive Neurorehabilitation Robotics Electroencephalography (EEG) Game
Mentor: Konstantinos Michmizos, Computer Science

Abstract: Through a combination of neuroscience, robotics, and computer science, neurorehabilitation robots (NRs) can help patients with neurological conditions recover through therapy. Many times, patients may have impaired ability in moving body parts such as their arms. The Bionik InMotion ARM is used as the NR platform for this project. The NRs give optimal results when patients have cognitive engagement (CE) while they perform highly repetitive tasks involving the affected body part. An efficient and effective approach to NR therapy while maintaining CE is performing these highly repetitive tasks in the form of serious games. The NR adapts to the patients' behavior or ability in order to maintain active participation from the patient and to encourage improvement. A key to improving the effectiveness of NR therapy is understanding the neurophysiological components to movement of specific body parts that can help brain plasticity and lead to better motor recovery, so another aspect of this project was developing a built-in measure of these components into the robotic system. This is done through the 128-channel Biosemi Active electroencephalography (EEG) system, along with motor behavior metrics of gameplay performance. So, the focuses on developing comprehensive representations of CE when performing neurorehabilitation tasks during therapy and creating new adaptive rehabilitation strategies that can further help patients in their recoveries.

Ryan Gross (Rutgers University-New Brunswick NJ)
Approximate computing: An effective likelihood-free method with statistical guarantees
Mentor: Minge Xie, Statistics and Biostatistics & Suzanne Thornton, Statistics and Biostatistics

Abstract: Approximate computing is a field of statistical inference techniques that can be used to produce estimates given complex sets of data. Approximate Bayesian Computing (ABC) and Approximate Confidence Distribution Computing (ACC) are two such methods that do not require the specification of a likelihood function, and hence can be used to estimate posterior distributions of parameters for simulation-based models. We look to apply these methods to a large data set, namely one containing pedestrian entrance and exit data for Madison Square Park in New York City. Given the technical and financial restraints of the counting procedure, much of the data is either missing or otherwise prone to error. Therefore, we propose several models to characterize the pedestrian traffic flow throughout the park. Then, simulation techniques are applied to replicate the missing data and produce distributions of simulated data. This leads to the application of ABC and ACC methods, which are used to ultimately produce accurate estimates of the total number of park users over a given time period.

Caitlin Guccione (University of Rhode Island RI)
Elucidating Tumor Evolutionary Patterns using High-Depth Molecular Data
Mentor: Hossein Khiabanian, Rutgers Cancer Institute of New Jersey

Abstract: Cancer is the second leading cause of death in the United States and yet it only has two main treatments, radiation and chemotherapy. A more efficient way to eliminate cancerous cells is with a targeted approach. In order to create more effective precision medication, there exists a need to understand how cancer develops and and to determine which cancerous mutations are most frequent in patients. The optimal way to answer these questions is by sequencing cancer tumors and tracking mutations over time with the help of mathematical trees. We use two genetic distances, Hamming and Nei to help structure the trees. We conclude that Nei's distance does a better job of accurately reflecting the changes in mutations over time and thus can be used in the future to track the evolution of cancerous cells.

Xiaotong Gui (Pomona College CA) & Xinru Liu (Wheaton College MA)
Multimodal Data Fusion in 3D Printing Quality Prediction
Mentor: Weihong 'Grace' Guo, Industrial and Systems Engineering

Abstract: This study focuses on the analysis of 3D surface measurements and quality prediction of 3D-printed dome-shaped objects using multimodal data fusion. Dimension, profile, and surface roughness were measured and represented in image data. Dimension reduction techniques were employed for extracting spatial patterns from the measurement images. Quality metrics were developed using profile deviation and surface roughness. In the end, classification and regression models were built to predict quality. The results propose feature extraction from high-dimensional image data as a promising technique for efficient and automated quality inspection.

Scott Harman (Rutgers University-New Brunswick NJ)
Curve-shortening Flow and Surfaces
Mentor: Christopher Woodward, Mathematics

Abstract: The curve-shortening flow is a process that modifies curves in the plane according to a certain partial differential equation. With certain starting conditions, the flow will shorten the perimeter and cause the curve to collapse to a circle, and then to a point, in finite time. The flow in the plane has been extensively studied, but we extend the notion of the flow to surfaces. We investigate how the evolution equation must be adapted so that the flow is well-defined, and then analyze certain surfaces and families of curves that provide a solution on those surfaces. We then pose certain open questions that will be studied further.

Aubrey Hormel (Missouri State University-Springfield MO)
Implementation of Formation Control and Path Scheduling in Multi-Agent Systems
Mentor: Jingjin Yu, Computer Science

Abstract: To move a group of agents over a graph from some initial to arbitrary target positions, I implement previously proposed algorithms for formation control and agent path scheduling. The arbitrary graph simple and connected, with unit edge lengths between adjacent vertices. Moreover, agents in the system are indistinguishable and allowed to travel to the best position in the target formation as determined by the control policy. These algorithms ensure that over all agents, a minimum distance is traveled, give a definite amount of time for agents to achieve the target formation, and eliminate the possibility of agent collision.

Rahul Ilango (Rutgers University-New Brunswick NJ) & Neekon Vafa (Harvard University MA)
The Non-Hardness of Approximating Circuit Size
Mentor: Eric Allender, Computer Science

Abstract: Understanding the computational difficulty of the Minimum Circuit Size Problem (MCSP) dates back to the 1950s and has wide-ranging implications in theoretical computer science. Since MCSP is not known to be NP-hard and is believed not to be in P, it is seen as a promising NP-intermediate candidate. However, despite extensive study of MCSP, there is little evidence that it is not NP-hard. In a recent development, Murray and Williams show that MCSP is not NP-hard under TIME(n0.49) projections. Murray and Williams also conjecture that MCSP is not NP-hard under logtime-uniform AC0 reductions. We show that any super-constant multiplicative approximation of MCSP is not hard for NP (in fact, not hard for PARITY) under even non-uniform AC0 many-one reductions. This result is surprising (and nearly tight) because Allender and Hirahara show that there is a constant-factor approximation to MKTP, a problem closely related to MCSP, that is hard for PARITY under non-uniform NC0 many-one reductions. To much frustration, this PARITY hardness result is not known for MCSP, and we show that the natural way to extend it to MCSP is actually impossible.

Froylan Maldonado (San Diego City College CA)
Chow Rings of Matroids
Mentor: Nicola Tarasca, Mathematics

Abstract: The idea of matroids dates back to the 1930s where mathematician Hassler Whitney introduces the abstraction of linear independence. After the subject was introduced several other mathematicians started contributing to the field; it soon expanded into different areas of interest like: coding theory, topology, and graph theory. For what we're doing, the reader doesn't require any background knowledge of matroids; only the basics of abstract algebra and graph theory. Matroids have a lot of interesting properties and also have many different ways of being defined. They can be defined from matrices, basis of different vector spaces, graphs, projective planes, and etc. Even though many of these objects don't seem to have a clear relantionship, they all are considered matroids. This lead us to believe that studying some more trivial cases of matroids might possibly reveal some unknown nuances of more complicated matroids.

Daniel Nakhimovich (Cooper Union for the Advancement of Science and Art NY)
Visualization of k-connected Components and Minimum Separating Sets of Fixed Points of Degree Peeling
Mentor: James Abello, Computer Science

Abstract: Graphs are an excellent form for the visualization of data because they clearly show how individual data points are connected. However, for large sets of data, a direct visualization of a graph is indecipherable to the human eye. Separating sets and k-connected components are interesting structures in graphs that highlight critical data points and clusters of highly connected points respectively. We developed an algorithm that uses minimum separating sets to decompose a graph into a hierarchy of k-connected components. The complexity of the algorithm depends linearly on the number of k-connected components in the graph. For each k-connected component K=(V,E), however, the complexity of finding its minimum separating set is $O(nm\binom{n}{2})$ where n=|V| and m=|E|. By using different approximate procedures this complexity can be improved to O(nm) or at more cost to accuracy to O(n+m). Performing the separating set decomposition creates a tree-structured map of the decomposed graph that more easily shows the connectivity of the graph and consequently the data it represents.

Ruby Ortiz (Muhlenberg College PA)
The Chromatic Polynomial of a Graph
Mentor: Nicola Tarasca, Mathematics

Abstract: The chromatic polynomial is shown to be a characteristic of the graph that demonstrates a relation. Then this idea is developed more with matroids and a diversity of graphs. A matroid can take up different forms from matrices to graphs to simple sets. There are multiple proofs of each graph’s chromatic polynomial using the Deletion-Contraction Relation, also presented in Huh's article. The deletion-contraction develops an interesting relation between graphs that can then be applied further to matroids. The chromatic polynomial of a graph has and continues to expand beyond 2-D graphs and even 3-D graphs, which is where work continues to develop.

Ryan Rice (The University of Texas at Austin TX)
Faster Convergence of Robust Mean Estimation in One Dimension
Mentor: Pranjal Awasthi, Computer Science

Abstract: The problem of learning the mean of a distribution from samples with at most an ε-fraction of adversarial corruptions has been the subject of much recent work. Several current breakthroughs are polynomial time algorithms for high dimensional data in specific classes of distributions. We consider the problem of robust mean estimation for "resilient" distributions and show that with an additional preprocessing step and different analysis, the current best polynomial time algorithm will only need O(log n) iterations of outlier removal in one dimension, as opposed to O(n) in the worst case. This is significant due to the costly outlier identification process involving principal component analysis or semidefinite programming in these algorithms. Our work could provide insight for future approaches on improving the practicality of robust mean estimation for non-gaussian distributions.

Evaristo Rodriguez (San Diego City College CA)
Shiny ICD-9 web app for the Cardiovascular Institute
Mentor: Javier Cabrera, Statistics and Biostatistics

Abstract: For hospitals one of the main problems in the process of generating mortality and morbidity data that is common to all countries, this requires both statistical and medical expertise. Medical doctors give a list of billing quotes, called ICD, however, this part is manually input, it is very time consuming, and increases the possibility for errors. This makes it difficult for the medical team to clearly assess the data. Which is why the creation of a web application with the utilization of the IDE "RStudio" and its many packages, it would facilitate the use of ICD-9 codes for medical teams. The app would help this process by helping the doctors get a better diagnosis and being more efficient in every possible way, by eliminating the possibility for manually input data errors, being a fast resource, and a trusted app.

Sherry Sarkar (Georgia Institute of Technology-Main Campus GA)
A Comparison of SAT Heuristic-Based Markov Chains
Mentor: Periklis Papakonstantinou, Management Science and Information Systems

Abstract: Schoning's algorithm and the break algorithm are SAT heuristic based Markov chains that have performed exceptionally fast compared to a brute force exponential approach. These Markov chains have been known to work well on random SAT instances over industrial SAT instances. In the first half of this paper, we discuss some sufficient conditions and examples in which one algorithm would perform better than the other in an effort to answer the general question of which SAT solver works better given a random SAT formula. In the second half of this paper, we construct a set of operations on Markov chains that preserve mixing time. Two operations in particular have been stated and proved here - the splitting of a state and the merge of two states.

Pejmon Shariati (Rutgers University-New Brunswick NJ)
Matroids and the Correlation Constant
Mentor: Nicola Tarasca, Mathematics

Abstract: Matroid Theory is one of the current hot topics in the mathematics world. My project is to gain a better understanding of them so that I may help solve some of the still unsolved conjectures. My goal for this summer was to become fluent in understanding the correlation constant of a matroid, a field, or graph. The correlation constant explains how the edges in a graph or the column vectors in a vector space configuration are correlated. We study different matrices and matroids and find correlation constants greater than 1 to improve the lower bound of the correlation constant for any field F.

Timothy Stavetski (University of Notre Dame IN)
Generating Nucleosome Crystal Visuals and Reference Frames
Mentor: Wilma Olson, Chemistry and Chemical Biology

Abstract: Nucleosomes, the first level of compaction for DNA, consist of DNA wrapped around histone proteins. My research involved reading about how nucleosomes are arranged in crystal structures. Crystal structures are identified with space groups, and so every nucleosome crystal is described by its space group. These crystal lattices and space groups are defined by the symmetry operations that produce them, but these symmetry operations do not lend themselves to analysis through other means. I came up with methods of writing symmetry data in terms of rigid body parameters. In addition I came up with different methods of visualizing the crystal structures that make the patterns easier to look at.

Michael Yang (Minerva Schools at Keck Graduate Institute CA)
Moving Beyond Observational Notions of Fairness
Mentor: Anand Sarwate, Electrical and Computer Engineering

Abstract: Computer scientists have already unleashed a bevy of technical definitions of fairness. It can be confusing for a newcomer to the field to make sense of the various definitions, their implications, and their shortcomings. In this article, we review previous and contemporary definitions of fairness. Earlier work in the field generally focuses on assessing and learning classifiers with respect to so-called observational notions of fairness. However, more recent work pointed out inherent limitations in these fairness definitions. That a classifier is fair with respect to a fixed notion of fairness is often not sufficient to address all intuitive unfairness in its predictions or the situation in which the algorithm is deployed. The upshot of this review is that "fair" algorithms must incorporate more information into the decision that is assessed merely by observational notions of fairness. Our review surveys three kinds of additional information: the causal structure between variables, the intersection of arbitrarily many protected categories, and the long-term effects of different predictions.

Return to REU Home Page
DIMACS Home Page
DIMACS Contact Information
Page last modified on July 24, 2018.