**DIMACS is very proud for the achievements of the 2018 DIMACS REU students during their summer research experience. This page includes the abstracts of the final papers that the students submitted to the program, summarizing their summer research.**

Andrew Brettin (

Mentor:

**Abstract: **Cancer is a genetic disease which begins from a single aberrant progenitor cell, which successively divides into pairs of daughter cells with similar reproductive capacities. As the tumor grows, many distinct genetic lineages may proliferate throughout the neoplasm, resulting in significant intratumor heterogeneity. Such high levels of genetic diversity within tumors is associated with low survival rates for patients, as genetically-distinct cell variants have differential sensitivities to current therapies. Despite its importance, how intratumor heterogeneity develops is not well understood. However, mathematical models can provide insight into potential causes. This work applies a stochastic operator-splitting method developed for simulating chemical reaction-diffusion processes to model neoplastic growth. It is shown that high differential birth-rates, high diffusion rates and early onset of mutations can result in substantial levels of intratumor heterogeneity.

Dalton Burke (

Mentor:

**Abstract: **With the rise of big data and machine learning, computers are used frequently to process and compute large quantities of data. An incoming job is parallelized, divided among different servers. Parallelizing introduces the issue of stragglers, the slowest workers. It is known that MDS coding schemes can be utilized so that the last servers to finish do not slow down the entire job. The known hierarchy of job scheduling policies from most efficient to least efficient is JSQ (join shortest queue), power-of-d, round robin, and random. The service time of balanced and incomplete block design (BIBD) is compared to the service time of other schemes in the hierarchy. BIBD is also tested with other models for which the hierarchy is unknown. At first, simplifying assumptions are made and then, a more realistic model of service time including both job specific randomness and server specific randomness is discussed. As a result, in certain scenarios, BIBD outperforms the other schemes in the hierarchy.

Debra Chait (

Mentor:

**Abstract: **The Apollonian circle packing, generated from three mutually-tangent circles in the plane, has inspired over the past half-century the study of other classes of space-filling packings, both in two and in higher dimensions. Recently, Kontorovich and Nakamura introduced the notion of crystallographic sphere packings, n-dimensional packings of spheres with symmetry groups that are isometries of H^{n+1}, which can be related to configurations of planes in H^{n+1} for various quadratic forms in n+2 variables. When applied in conjunction with the Koebe-Andreev-Thurston Theorem, Kontorovich and Nakamura's Structure Theorem guarantees crystallographic packings to be generated from polyhedra in n=2. The Structure Theorem similarly allows us to generate packings from the reflective extended Bianchi groups in n=2 by applying Vinberg's algorithm to obtain the appropriate Coxeter diagrams. In n>2,
the Structure Theorem when used with Vinberg's algorithm allows us to explore whether certain Coxeter diagrams in H^{n+1} for a given quadratic form admit a packing at all. Kontorovich and Nakamura's Finiteness Theorem shows that there exist only finitely many classes of superintegral such packings, all of which exist in dimensions n<21. In this work, we systematically determine and enumerate crystallographic sphere packings arising from polyhedra on up to seven vertices, all crystallographic packings arising from Bianchi groups, and all known examples of crystallographic packings arising from higher dimensional quadratic forms.

Gabriel Eiseman (

Mentor:

**Abstract: **I investigated the minimality of a construction for the fourth tangent circle to three tangent circles (aka the Apollonian circle). Baragar and Kontorovich (my mentor) discovered a general 7 step construction and I showed that under certain assumptions about arbitrary points it is indeed minimal. I also found a 5 step construction for when the circles' centers form a right triangle and showed it and known constructions for isosceles (5 steps) and equilateral (3 steps) triangles are minimal. I am pretty sure the assumption I made about arbitrary points, which is that only a certain subset of them actually need to be considered, is correct, but have not formalized the argument.

Christopher Espana (

Mentor:

**Abstract: **Pulmonary embolism affects hundreds of thousands of people across the United States and can cause complications such as hemoptysis or cardiac arrest. Pulmonary embolism is characterized by an embolus becoming lodged in a pulmonary artery, where the origin of this embolus is often attributed to deep vein thrombosis. Atrial fibrillation often causes emboli to develop in the heart and propagate through the body. Through the use of the Myocardial Infarction Dara Acquisition System (MIDAS), a database of discharge data on myocardial infarction patients in New Jersey, we seek to study the relationship between atrial fibrillation and pulmonary embolism, in order to understand if there exists a relationship between the diseases.

Heman Gandhi (

Mentor:

**Abstract: **The structure of Γ, the bialgebra of stable Grothendieck polynomials,
is considered and a basis is shown to contain various classes of of
polynomials, particularly hook shapes, 2-row shapes, and certain 3-row
shapes.

Nicholas Georgiou (

Mentor:

**Abstract: **Through a combination of neuroscience, robotics, and computer science,
neurorehabilitation robots (NRs) can help patients with neurological conditions recover through
therapy. Many times, patients may have impaired ability in moving body parts such as their arms.
The Bionik InMotion ARM is used as the NR platform for this project. The NRs give optimal
results when patients have cognitive engagement (CE) while they perform highly repetitive tasks
involving the affected body part. An efficient and effective approach to NR therapy while
maintaining CE is performing these highly repetitive tasks in the form of serious games. The NR
adapts to the patients' behavior or ability in order to maintain active participation from the patient
and to encourage improvement. A key to improving the effectiveness of NR therapy is
understanding the neurophysiological components to movement of specific body parts that can
help brain plasticity and lead to better motor recovery, so another aspect of this project was
developing a built-in measure of these components into the robotic system. This is done through
the 128-channel Biosemi Active electroencephalography (EEG) system, along with motor
behavior metrics of gameplay performance. So, the focuses on developing comprehensive
representations of CE when performing neurorehabilitation tasks during therapy and creating new
adaptive rehabilitation strategies that can further help patients in their recoveries.

Ryan Gross (

Mentor:

**Abstract: **Approximate computing is a field of statistical inference techniques that can be used to produce estimates given complex sets of data. Approximate Bayesian Computing (ABC) and Approximate Confidence Distribution Computing (ACC) are two such methods that do not require the specification of a likelihood function, and hence can be used to estimate posterior distributions of parameters for simulation-based models. We look to apply these methods to a large data set, namely one containing pedestrian entrance and exit data for Madison Square Park in New York City. Given the technical and financial restraints of the counting procedure, much of the data is either missing or otherwise prone to error. Therefore, we propose several models to characterize the pedestrian traffic flow throughout the park. Then, simulation techniques are applied to replicate the missing data and produce distributions of simulated data. This leads to the application of ABC and ACC methods, which are used to ultimately produce accurate estimates of the total number of park users over a given time period.

Caitlin Guccione (

Mentor:

**Abstract: **Cancer is the second leading cause of death in the United States and yet it only has two main treatments, radiation and chemotherapy. A more efficient way to eliminate cancerous cells is with a targeted approach. In order to create more effective precision medication, there exists a
need to understand how cancer develops and and to determine which cancerous mutations are most
frequent in patients. The optimal way to answer these questions is by sequencing cancer tumors and
tracking mutations over time with the help of mathematical trees. We use two genetic distances,
Hamming and Nei to help structure the trees. We conclude that Nei's distance does a better job of
accurately reflecting the changes in mutations over time and thus can be used in the future to track
the evolution of cancerous cells.

Xiaotong Gui (

Mentor:

**Abstract: **This study focuses on the analysis of 3D surface measurements and quality prediction of 3D-printed dome-shaped objects using multimodal data fusion. Dimension, profile, and surface roughness were measured and represented in image data. Dimension reduction techniques were employed for extracting spatial patterns from the measurement images. Quality metrics were developed using profile deviation and surface roughness. In the end, classification and regression models were built to predict quality. The results propose feature extraction from high-dimensional image data as a promising technique for efficient and automated quality inspection.

Scott Harman (

Mentor:

**Abstract: **The curve-shortening flow is a process that modifies curves in the plane according to a certain partial differential equation. With certain starting conditions, the flow will shorten the perimeter and cause the curve to collapse to a circle, and then to a point, in finite time. The flow in the plane has been extensively studied, but we extend the notion of the flow to surfaces. We investigate how the evolution equation must be adapted so that the flow is well-defined, and then analyze certain surfaces and families of curves that provide a solution on those surfaces. We then pose certain open questions that will be studied further.

Aubrey Hormel (

Mentor:

**Abstract: **To move a group of agents over a graph from some initial to arbitrary target positions, I implement previously proposed algorithms for formation control and agent path scheduling. The arbitrary graph simple and connected, with unit edge lengths between adjacent vertices. Moreover, agents in the system are indistinguishable and allowed to travel to the best position in the target formation as determined by the control policy. These algorithms ensure that over all agents, a minimum distance is traveled, give a definite amount of time for agents to achieve the target formation, and eliminate the possibility of agent collision.

Rahul Ilango (

Mentor:

**Abstract: **Understanding the computational difficulty of the Minimum Circuit Size Problem (MCSP) dates back to the 1950s and has wide-ranging implications in theoretical computer science. Since MCSP is not known to be NP-hard and is believed not to be in P, it is seen as a promising NP-intermediate candidate. However, despite extensive study of MCSP, there is little evidence that it is not NP-hard. In a recent development, Murray and Williams show that MCSP is not NP-hard under TIME(n^{0.49}) projections. Murray and Williams also conjecture that MCSP is not NP-hard under logtime-uniform AC^{0} reductions. We show that any super-constant multiplicative approximation of MCSP is not hard for NP (in fact, not hard for PARITY) under even non-uniform AC^{0} many-one reductions. This result is surprising (and nearly tight) because Allender and Hirahara show that there is a constant-factor approximation to MKTP, a problem closely related to MCSP, that is hard for PARITY under non-uniform NC^{0} many-one reductions. To much frustration, this PARITY hardness result is not known for MCSP, and we show that the natural way to extend it to MCSP is actually impossible.

Froylan Maldonado (

Mentor:

**Abstract: **The idea of matroids dates back to the 1930s where mathematician
Hassler Whitney introduces the abstraction of linear independence.
After the subject was introduced several other mathematicians started
contributing to the field; it soon expanded into different areas of interest like:
coding theory, topology, and graph theory. For what we're doing, the reader
doesn't require any background knowledge of matroids; only the basics of
abstract algebra and graph theory.
Matroids have a lot of interesting properties and
also have many different ways of being defined. They can be defined
from matrices, basis of different vector spaces, graphs,
projective planes, and etc. Even though many of these objects don't
seem to have a clear relantionship, they all are
considered matroids. This lead us to believe that studying some
more trivial cases of matroids might possibly
reveal some unknown nuances of more complicated matroids.

Daniel Nakhimovich (

Mentor:

**Abstract: **Graphs are an excellent form for the visualization of data because they clearly show how individual data points are connected. However, for large sets of data, a direct visualization of a graph is indecipherable to the human eye. Separating sets and k-connected components are interesting structures in graphs that highlight critical data points and clusters of highly connected points respectively. We developed an algorithm that uses minimum separating sets to decompose a graph into a hierarchy of k-connected components. The complexity of the algorithm depends linearly on the number of k-connected components in the graph. For each k-connected component K=(V,E), however, the complexity of finding its minimum separating set is $O(nm\binom{n}{2})$ where n=|V| and m=|E|. By using different approximate procedures this complexity can be improved to O(nm) or at more cost to accuracy to O(n+m). Performing the separating set decomposition creates a tree-structured map of the decomposed graph that more easily shows the connectivity of the graph and consequently the data it represents.

Ruby Ortiz (

Mentor:

**Abstract: **The chromatic polynomial is shown to be a characteristic
of the graph that demonstrates a relation. Then this idea is
developed more with matroids and a diversity of graphs. A
matroid can take up different forms from matrices to graphs
to simple sets. There are multiple proofs of each graph’s
chromatic polynomial using the Deletion-Contraction Relation,
also presented in Huh's article. The deletion-contraction
develops an interesting relation between graphs that can then
be applied further to matroids. The chromatic polynomial of a
graph has and continues to expand beyond 2-D graphs and even
3-D graphs, which is where work continues to develop.

Ryan Rice (

Mentor:

**Abstract: **The problem of learning the mean of a distribution from samples with
at most an ε-fraction of adversarial corruptions has been the
subject of much recent work. Several current breakthroughs are
polynomial time algorithms for high dimensional data in specific
classes of distributions. We consider the problem of robust mean
estimation for "resilient" distributions and show that with an
additional preprocessing step and different analysis, the current best
polynomial time algorithm will
only need O(log n) iterations of outlier removal in one dimension,
as opposed to O(n) in the worst case. This is significant due to the
costly outlier identification process involving principal component
analysis or semidefinite programming in these algorithms. Our work
could provide insight for future approaches on improving the
practicality of robust mean estimation for non-gaussian distributions.

Evaristo Rodriguez (

Mentor:

**Abstract: **For hospitals one of the main problems in the process of generating mortality and morbidity data that is common to all countries, this requires both statistical and medical expertise. Medical doctors give a list of billing quotes, called ICD, however, this part is manually input, it is very time consuming, and increases the possibility for errors. This makes it difficult for the medical team to clearly assess the data. Which is why the creation of a web application with the utilization of the IDE "RStudio" and its many packages, it would facilitate the use of ICD-9 codes for medical teams. The app would help this process by helping the doctors get a better diagnosis and being more efficient in every possible way, by eliminating the possibility for manually input data errors, being a fast resource, and a trusted app.

Sherry Sarkar (

Mentor:

**Abstract: **Schoning's algorithm and the break algorithm are SAT heuristic based Markov chains
that have performed exceptionally fast compared to a brute force exponential approach. These
Markov chains have been known to work well on random SAT instances over industrial SAT instances.
In the first half of this paper, we discuss some sufficient conditions and examples in which one
algorithm would perform better than the other in an effort to answer the general question of which
SAT solver works better given a random SAT formula. In the second half of this paper, we construct
a set of operations on Markov chains that preserve mixing time. Two operations in particular have
been stated and proved here - the splitting of a state and the merge of two states.

Pejmon Shariati (

Mentor:

**Abstract: **Matroid Theory is one of the current hot topics in the mathematics world. My project is to gain a better understanding of them so that I may help solve some of the still unsolved conjectures. My goal for this summer was to become fluent in understanding the correlation constant of a matroid, a field, or graph. The correlation constant explains how the edges in a graph or the column vectors in a vector space configuration are correlated. We study different matrices and matroids and find correlation constants greater than 1 to improve the lower bound of the correlation constant for any field F.

Timothy Stavetski (

Mentor:

**Abstract: **Nucleosomes, the first level of compaction for DNA, consist of DNA wrapped around histone proteins. My research involved reading about how nucleosomes are arranged in crystal structures. Crystal structures are identified with space groups, and so every nucleosome crystal is described by its space group. These crystal lattices and space groups are defined by the symmetry operations that produce them, but these symmetry operations do not lend themselves to analysis through other means. I came up with methods of writing symmetry data in terms of rigid body parameters. In addition I came up with different methods of visualizing the crystal structures that make the patterns easier to look at.

Michael Yang (

Mentor:

**Abstract: **Computer scientists have already unleashed a bevy of technical definitions of fairness. It can be confusing for a newcomer to the field to make sense of the various definitions, their implications, and their shortcomings. In this article, we review previous and contemporary definitions of fairness. Earlier work in the field generally focuses on assessing and learning classifiers with respect to so-called observational notions of fairness. However, more recent work pointed out inherent limitations in these fairness definitions. That a classifier is fair with respect to a fixed notion of fairness is often not sufficient to address all intuitive unfairness in its predictions or the situation in which the algorithm is deployed. The upshot of this review is that "fair" algorithms must incorporate more information into the decision that is assessed merely by observational notions of fairness. Our review surveys three kinds of additional information: the causal structure between variables, the intersection of arbitrarily many protected categories, and the long-term effects of different predictions.

Return to REU Home Page

DIMACS Home Page

DIMACS Contact Information

Page last modified on July 24, 2018.

DIMACS Home Page

DIMACS Contact Information

Page last modified on July 24, 2018.