2024 DIMACS REU Proposed Project Descriptions

IMPORTANT! Read carefully because the application system has changed!

- If you are interested in any of the projects 1-12 you have to apply through the NSF ETAP system through this link: https://etap.nsf.gov/award/473/opportunity/6635
- If you are interested in any of the projects 13-21 you have to apply through the DIMACS REU application portal here: https://reu.dimacs.rutgers.edu/apply
- If you are interested in topics in both lists, you will have to apply TWICE using both links above.

*Note that you must be a U.S. Citizen or Permanent Resident to be eligible for these projects.

Cut/Flow Structures in Graphs with Terminals

Project #: DC2024-01
Mentor: Zihan Tan, DIMACS

Abstract: Graphs are becoming larger and larger every day, but the vertices of interest to us (called terminals) are potentially few. The paradigm of vertex sparsification explores the power and limit of compressing large graphs into smaller ones, while preserving important information, specifically cut and flow structures, with respect to terminals.
In this project, we will study a fundamental characterization problem (for cuts): Given a set T of terminals and a (2^T-2)-dimensional vector π whose coordinates are indexed by proper subsets of T, decide if there is a graph G that contains T, such that for all subsets S of T, π_S equals the value of the min-cut in G separating S from T\setminus S. We will also explore related questions concerning flow structures.

Approximability of Euclidean k-center and k-diameter

Project #: DC2024-02
Mentor: Karthik Srikanta, Computer Science

Abstract: In the Euclidean k-center problem we are given a set of points in the Euclidean plane and the goal is to partition them into k parts (called clusters) so as to minimize the maximum over all clusters, the radius of the minimum enclosing circle for the clusters. In the Euclidean k-diameter we are given the same input and would like to minimize the maximum diameter over all the clusters. In this project, you will try to bridge the gap for both these problems between their polynomial time approximation factor and their NP-hardness inapproximability factor.

Tight memory-sample tradeoffs for learning parity with noise (LPN)

Project #: DC2024-03
Mentor: Sumegha Garg, Computer Science

Abstract: With the increasing scale of machine learning tasks, it is important to study the feasibility of learning under memory constraints. A recent line of work provides strong memory-sample tradeoffs for many natural learning tasks. Particularly, breakthrough work of [Raz'16] shows that, for the problem of parity learning on n variables, any algorithm requires either n^2/25 memory or exponential number of samples to learn. Note that, a learner with n^2 memory can parity learn with just O(n) samples. Subsequent works extended these tradeoffs to other learning tasks. In this project, we will focus on the problem of learning parity with noise, which is a well-studied and motivated problem in coding theory and cryptography. The goal is to prove a tight memory-sample tradeoff for this problem, where the memory lower bound depends quadratically on the error parameter. Proving such a tradeoff, if true, would require significantly new techniques. Previous work on memory-sample tradeoffs for the LPN problem [GKLR'21] is based on the extractor-based approach of [GRT'18]. The student is expected to have some basic background in computation complexity and boolean function analysis.
[Raz'16] https://arxiv.org/abs/1602.05161
[GRT'18] https://arxiv.org/abs/1708.02639
[GKLR'21] https://arxiv.org/pdf/2107.02320.pdf

Advancing Homomorphic Encryption: Robust Security and Efficient Schemes with Large Keys

Project #: DC2024-04
Mentor: Periklis Papakonstantinou, Management Science and Information Systems

Abstract: Homomorphic encryption stands out as a highly adaptable cryptographic primitive. However, like many cryptographic methods, its security is traditionally assessed against adversaries operating within polynomial time constraints. This project explores a compelling question: what if the adversary exceeds polynomial time limits? Consider an attacker employing exponential time - a scenario that would compromise traditional homomorphic encryption schemes.
The primary focus of this project is the development of homomorphic encryption schemes secure against all-powerful adversaries. Additionally, we aim to construct schemes that not only bolster security but also exhibit exceptional efficiency. To achieve these dual objectives, we will explore a paradigm shift where the cryptographic keys are extremely large.

Projects in Algorithmic Game Theory

Project #: DC2024-05
Mentor: Daniel Schoepflin, DIMACS

Abstract: Algorithmic mechanism design (AMD) and algorithmic game theory (AGT) lie at the intersection of computer science, operations research, and economics. In AMD and AGT, classical problems in computer science (e.g., job scheduling) are examined through the lens of economics and game theory (e.g., strategic considerations) and classical problems in economics (e.g., auction design) are examined through the lens of computer science (e.g., designing efficient algorithms and approximation). How should one design algorithms that "real humans" interact with (e.g., should they be "simple" to understand) and how should the possible strategic behaviors of these human agents influence the way we think about the way we design algorithms (e.g., when can we make algorithms "robust" to strategic behaviors)? In AMD, we seek formal answers to these problems, looking to design algorithms with provable performance guarantees while carefully balancing tradeoffs between algorithmic simplicity/tractability, performance, and robustness.
Two more concrete areas that the REU students may explore are (1) designing algorithms taking into account models from behavioral game theory to better serve "real-world" agents and (2) thinking about how one should design algorithms for social good applications. The final form of the project will be determined alongside with the student within the framework of AMD and AGT. One need not have taken courses in economics or game theory, but courses in algorithms and discrete mathematics would be advised, since formal proofs would be a component of any project.

Differential privacy and data visualization

Project #: DC2024-06
Mentor: Anand Sarwate, Electrical and Computer Engineering

Abstract: Differential privacy is mathematical framework for quantifying the privacy risks when computing using sensitive data about individuals. This is a statistical/probabilistic notion of privacy which can be used to generate privacy-protecting summaries of data. The 2020 US Census used differential privacy to publishing results from the 2020 decennial census and many government agencies are trying to learn how to use differential privacy for their work. In many cases, private data is summarized through visualizations and descriptive statistics. Much of the work on differential privacy has focused on machine learning and data publishing. The student on this project will survey existing work in differential privacy and private visualization, choose a type of visualization, and evaluate the methods on data sets. The goal is to implement different data visualization techniques and see how privacy affects our qualitative (visual) understanding of the structure of data.
Research assistants will be responsible for researching current techniques for visualization with privacy and applying them to data sets from an application jointly chosen with the advisor. An ideal candidate will have some exposure to statistics and some experience with plotting data using a programming language (Python preferable, MATLAB is OK too).
The learning objectives are structured around three axes: (i) understanding the structure scholarly research, (ii) translating research into practice (through programming), and (iii) communicating research results. The first objective (i) will be addressed by completing a comprehensive literature review on the topic to get a sense of the "lay of the land" and opportunities to pursue unexplored directions. The second objective (ii) will involve programming and implementing privacy-preserving algorithms using the Python programming language. Additional support and mentoring will be provided by graduate student researchers working in the lab. The third objective (iii) will be achieved through presentation practice at group meetings and poster presentations with feedback from the professor and students in the lab.

Graph algorithms with weak and strong signals

Project #: DC2024-07
Mentor: Shahrzad Haddadan, Management Sciences and Information Systems

Abstract: For a given set of objects V={v1,v2,...,vn} and a function σ: VxV→ ℝ+, various graph problems can be considered. With the emergence of large graphs, the assumption of a priori knowledge of all values of function σ is no longer valid. Instead, the function value is estimated through querying a machine learning model at some cost; often precise estimation is viable by using resource intensive machine learning models. Alternatively, noisy estimates may be available at lower costs. For example, for a set of text documents, we may obtain the similarities between each pair by querying an expensive highly-accurate cross-attention model or a cheaper noisy embedding-based model.
In this setting, we investigate the classical graph problems of community detection (dense subgraphs) and finding influential vertices (set-cover), while allowing limited access to the graph through a number of queries that can be made.

Prerequisites: Candidates should be familiar with some programming language (C++, Java, Python, or R for programming and Python or R for visualization) and have a basic understanding of graph theory concepts and data structures.

Truth Learning in a Social and Adversarial Setting

Project #: DC2024-08
Mentor: Jie Gao, Computer Science

Abstract: Individuals gather information by both independent observations and communication with others in a social environment. Social Learning studies how dispersed and self-interested agents aggregate information. In this project we study whether social learning can enable truth learning. It is well known that learning can fail due to herding, where agents collectively focus on an early false consensus and ignore subsequent observations. Therefore a general goal in social learning is to characterize when truth learning is guaranteed. In this project we study the computational problem of social learning with different observation models, the environment of learning and social processes such as peer pressure, social influence, network effect as well as the existence of adversarial players.

Data Privacy and Applied Social Science Research

Project #: DC2024-09
Mentor: Ruobin Gong, Statistics

Abstract: We live in a digital world in which everyone produces an explosive amount of personal data. When these data translate into open-source databases accessible for research, they can be an invaluable resource in advancing science, fostering public knowledge, and improving research reproducibility. The richer, more accurate, and more detailed the data are, the more informative they can be for the scientific purpose. At the same time, the custodian of these data have more responsibility in ensuring the confidentiality of the individual data contributors. Recent developments in the literature of formal privacy, notably differential privacy, presents a promising, mathematically rigorous framework to conceptualize data privacy protection. The challenge is how to strike the right balance between effective privacy protection and the usability of the data for research purposes.
This project evaluates the potential of modern privacy methods for the various data-driven disciplines, with a focus on the quantitative social sciences including (but not limited to) the behavioral sciences, political science, sociology, and economics. The project asks the following questions: what is the current state and mindset of data privacy protection in the subject matter discipline? For benchmark data products (including important surveys and official databases) what are the current protocols of confidentiality protection, and what are the viable methods and standards moving forward in light of new developments in formal privacy? In what ways can new data privacy standards help promote data sharing, transparency, and open science? How can existing data analysis methods for privacy-protected data apply to concrete use cases?
This project will be suitable for a student specializing in statistics and/or data science with a demonstrated interest in the quantitative social sciences (such as those disciplines listed above). The ideal candidate is a student who has taken, and excelled in, courses both in statistics/data science and in a relevant subject matter science. Prior experience with writing data analysis programs (in R or Python) is essential, as is a passion to work with data in general.

The student will conduct literature reviews, connect and potentially extend current privacy protection methodologies, perform data analysis as necessary, and summarize research findings into a rigorous academic report. Key elements of the project include the investigation of important open-access survey and official data products and their current confidentiality protection protocols, as well as the implementation of state-of-the-art data privacy protection algorithms and anlysis programs for privacy-protected data as the situation may require.

Optimization, learning and high-dimensional macroscopic limits

Project #: DC2024-10
Mentor: Pierre Bellec, Statistics

Abstract: The last decade has seen the emergence of new phenomena where complex statistical learning problems such as high-dimensional regression and classification can be accurately summarized by simple systems of equations. These simple systems of equations characterize the high-dimensional limit of the statistical learning problem at hand and provide new insights on regularization and the choice of statistical estimators in high-dimension. The project will explore problems in this line of research, requiring and developing skills in probability, statistical/machine learning, numerical programming and computational linear algebra.

Genome folding and function: from DNA base pairs to nucleosome arrays and chromosomes

Project #: DC2024-11
Mentor: Wilma Olson, Chemistry and Chemical Biology

Abstract: Despite the astonishing amount of information now known about the sequential makeup and spatial organization of many genomes, there is much to learn about how the vast numbers of DNA base pairs in these systems contribute to their three-dimensional architectures and workings. The chemical structure of DNA contains information that dictates the observed spatial arrangements of base pairs within the double helix and the susceptibilities of different base sequences to interactions with proteins and other molecules. Our group has been developing software to analyze the spatial and energetic codes within known DNA structures and using these data in computational studies of larger nucleic acid systems. By representing the DNA bases as sets of geometric objects, we can capture the natural deformability of DNA as well as the structural details of ligand-bound interactions (1,2). There are now sufficient data to examine the influence of sequence context, i.e., the effects of immediate neighbors, on the apparent motions of the bases and to develop simple functions that capture this information. The observed behavior can be incorporated in computational treatments to improve our understanding of macromolecular systems in which DNA sequence and protein binding play important roles. For example, DNA elements that regulate gene expression often lie far apart along genomic sequences but come close together during genetic processing. The intervening residues form loops, mediated by proteins, which bind at the sequentially distant sites. We have developed methods that take account of both DNA and protein in simulations of loop formation and successfully captured experimental observations (3-5). Our recent studies of long-range communication between regulatory elements on short nucleosome-decorated DNA (6) provide a basis for linking the 'local' arrangements of nucleosomes in these systems to higher-order macromolecular structures, such as the looped architectures detected within chromosomes and the circular pieces of genomic DNA associated with a variety of cancers. We are developing new algorithms to characterize the spatial arrangements of nucleosome-decorated DNA circles and incorporating the collective information in nucleosome-level depictions of chromosomal DNA (7,8). The build-up of structural components in our treatment of DNA and chromatin offers new insight into the stepwise processes that underlie genome organization and processing.

1. Young, R.T., L. Czapla, Z.O. Wefers†, B. Cohen†, W.K. Olson. (2022) Revisiting DNA sequence-dependent deformability in high-resolution structures: effects of flanking base pairs on dinucleotide morphology and global chain configuration. Life (Basel) 12:759. DOI: 10.3390/life12050759
2. Olson, W.K., Y. Li, M.O. Fenley. (2022) Insights into DNA solvation found in protein-DNA structures. Biophys. J. 121:4749-4758. DOI: 10.1016/j.bpj.2022.11.019
3. Wei, J., L. Czapla, M.A. Grosner, D. Swigon, W.K. Olson. (2014) DNA topology confers sequence specificity to nonspecific architectural proteins. Proc. Natl. Acad. Sci. USA 111:16742-16747. DOI: 10.1073/pnas.1405016111
4. Clauvelin, N., W.K. Olson. (2021) Synergy between protein positioning and DNA elasticity: energy minimization of protein-decorated DNA minicircles. J. Phys. Chem. B 125:2277-2287. DOI: 10.1021/acs.jpcb.0c11612
5. Young, R.T., N. Clauvelin, W.K. Olson. (2022) emDNA - A tool for modeling protein-decorated DNA loops and minicircles at the base-pair step level. J. Mol. Biol. 434:167558. DOI: 10.1016/j.jmb.2022.167558
6. Clauvelin, N., P. Lo†, O.I. Kulaeva, E.V. Nizovtseva, J. Dias, J. Zola, M. Parashar, V. Studitsky, W.K. Olson. (2015) Nucleosome positioning and composition modulate in silico chromatin flexibility. J. Phys.: Condens. Matter 27:064112. DOI: 10.1088/0953-8984/27/6/064112
7. Todolli, S., P.J. Perez, N. Clauvelin, W.K. Olson. (2017) Contributions of sequence to the higher-order structures of DNA. Biophys. J. 112:416-426. DOI: 10.1016/j.bpj.2016.11.017
8. Todolli, S., R.T. Young, A.S. Watkins†, A. Bu Sha†, J. Yager†, W.K. Olson. (2021) Surprising twists in nucleosomal DNA with implication for higher-order folding. J. Mol. Biol. 433:167121. DOI: 10.1016/j.jmb.2021.167121

†-Undergraduate research student

Genomic data-guided computational modeling of cancer

Project #: DC2024-12
Mentor: Subhajyoti De, Rutgers Cancer Institute of New Jersey

Abstract: Cancer is the second leading cause of death in adults, claiming the lives of more than half a million Americans every year. Tumor starts from a single rogue cell in the body, which becomes defective, and the clones of that initial cell start to grow (or die out) and acquire new mutations as the populations of tumor cells grow. During treatment, the patient initially responds, and the tumor shrinks, but remaining cells inevitably develop mutations that give them drug resistance, and the tumor starts to grow again until the patient dies. It is becoming apparent that computational algorithms and mathematical frameworks can identify subtle patterns in patient profiles, which can help with early detection and personalized treatment, improving their outcome. We will develop computational model to capture principals of population dynamics of tumor cells during development and emergence of resistance against cancer drugs, using concepts based on Galton Watson branching process and reaction diffusion system. We will combine computational models with clinical data from cancer patients to investigate whether we can predict the timeline of tumor development and thereby (i) detect cancer at an early stage or (ii) identify emergence of drug resistance very early during treatment. This project will develop fundamental evolutionary concepts, implement relevant computational models, and apply them in clinical settings to help improve cancer care.

↑↑↑ FOR PROJECTS ABOVE THIS LINE PLEASE APPLY HERE: https://etap.nsf.gov/award/473/opportunity/6635 ↑↑↑

↓↓↓ FOR PROJECTS UNDER THIS LINE PLEASE APPLY HERE:https://reu.dimacs.rutgers.edu/apply ↓↓↓

Training LLMs on Mathematica packages

Project #: DC2024-13
Mentor: Ayush Khaitan, Mathematics

Abstract: The bread and butter of physicists and differential geometers the across the world is performing long calculations by hand, that can sometimes run over several pages. Although software and Mathematica packages exist, learning the syntax for them can often be a hurdle, and leads to a much reduced use. LLMs like ChatGPT have recently led to an increased use of programming by non-programmers, by their sheer ability to write code. However, due to the lack of training data, these LLMs are unable to write code using these Mathematica packages and software. We intend to train LLMs to be able to write code using these specialized packages. We will focus on the Mathematica package Ricci, developed by John Lee. This would be tremendously helpful for many physicists and mathematicians across the world. After training, we hope to calculate conformal invariants like higher order obstruction tensors of the Fefferman-Graham ambient metric. It is expected that this work can be generalized to other software too.

When Fourier analysis meets ergodic theory and combinatorics

Project #: DC2024-14
Mentor: Mariusz Mirek, Mathematics & Leonidas Daskalakis, Mathematics

Abstract: Pointwise convergence in ergodic theory is the most natural (as well as the most challenging to establish) mode of convergence. This line of inquiry began in the early 1930's with von Neumann's mean ergodic theorem and Birkhoff's pointwise ergodic theorem and admitted profound extensions, particularly following Furstenberg's ergodic proof of a celebrated Szemeredi's theorem (on the existence of arbitrarily long arithmetic progressions in subsets of integers with positive density) and the work of Bourgain in the late 1980s.

After Bourgain's groundbreaking papers, which were acknowledged in his Fields Medal announcement, Elias Stein began to call this area Discrete Analogues in Harmonic Analysis due to its discrete flavor and interactions with number theory, additive combinatorics, probability, and Fourier analysis, among other fields. Our focus will be on studying various intriguing open problems at the intersection of ergodic theory, harmonic analysis, additive combinatorics, and analytic number theory.

Mathematical Investigations of Physics in One Space Dimension

Project #: DC2024-15
Mentor: Shadi Tahvildar-Zadeh, Mathematics

Abstract: Imagine the universe consisted only of beads of different size and color on a single string, with some rules about how they would interact, combine, separate from, or bounce off one another. How would familiar notions of classical physics such as gravity or electromagnetism look like in such a universe? What about quantum phenomena? How far can one go on this quest? Is it feasible to look for a one-dimensional "theory of everything"?
The goals of this REU project are to
- introduce students to the mathematical tools they need in order to understand these questions,
- help them use those tools to find satisfactory answers to some of the questions
- give them the vocabulary they need in order to ask many more such profound questions, not just about this cartoon world, but also about the actual world we live in.
The emphasis will be on mathematical rigor and exact analysis. Previous knowledge of physics is welcome but not required. Mathematical knowledge at the level of Calculus 3 and Linear Algebra is required, as is familiarity with at least one computational software such as Matlab, Maple, or Mathematica.

Projects in Pure Math

Project #: DC2024-16
Mentor: Faculty of the Math Department, Mathematics

Abstract: The Math REU Coordinator is Prof. Kristen Hendricks. Several projects will be run in 2023 by members of the pure math faculty, in addition to the ones described below. Applicants interested in a project in pure mathematics should list this project in their application and indicate their specific Math interests; further details on the projects will be worked out in the Spring.

Simplifying explicit equations of fake projective planes

Project #: DC2024-17
Mentor: Lev Borisov, Mathematics

Abstract: Fake projective planes are special complex surfaces (four real dimensions). It is known that there are exactly 100 of them. Some of these surfaces (22 so far) are given by explicit equations which allows one to investigate their more subtle features, which is the goal of the project. Computations will involve using various software packages, most notably Mathematica, Magma and/or Macaulay2. An ideal participant will have some knowledge of algebraic geometry and some familiarity with mathematical computing.

Curves on topological surfaces

Project #: DC2024-18
Mentor: Feng Luo, Mathematics & Hongbin Sun, Mathematics

Abstract: Counting the number of closed geodesics on a hyperbolic or flat surface of lengths up to a given number is a classical problem in geometry. It origins from the Gauss circle problem on determining how many integer lattice points within in a given circle. In 1959, Huber proved that the number of closed geodesics on a hyperbolic surface of length at most L is asymptotic to exp(L)/L. More recent breakthrough work by Mirzakhani and others computed the number of simple closed geodesics. In this REU project, we will investigate the number of homotopy classes of loops on punctured surfaces equipped with ideal triangulations. The traditional geodesic length will be replaced by the geometric intersection number. It is expected that similar results will hold.
Another topic in this project is to construct explicit pairs of essential simple closed curves on surfaces that have low distance in the curve graph. For each surface S, Harvey introduced the curve graph of S, whose vertices are isotopy classes of essential simple closed curves on S, and two vertices are connected by an edge if they have disjoint representatives. The curve graph is a connected, infinite diameter graph. It is also a very important tool for studying Teichmuller spaces of surfaces and hyperbolic 3-manifolds. It turns out that for small d greater than 3, we do not have many explicit examples of pairs of curves that have distance exactly d. In this project, we will construct explicit pairs of curves that have distance exactly 4, 5 and 6.
The students should have a solid knowledge of surface topology, basic algebraic topology, and the method of generating function from discrete mathematics. Some knowledge of Riemannian geometry will be very helpful, but is not required.

[1] Saul Schleimer's unpublished notes "Notes on the complex of curves"
[2] Mosher, Lee, "Tiling the projective foliation space of a punctured surface", Trans. Amer. Math. Soc. 306 (1988), no. 1, 1-70

Simulating Random Origami

Project #: DC2024-19
Mentor: Ian Tobasco, Mathematics

Abstract: Origami is a familiar concept: fold a sheet of paper along a particular sequence of lines, and you can make a paper crane, or perhaps even a human face (if you try hard enough!). In a sense, the known folding patterns are extreme examples - they are too carefully designed to tell us what origami really is. This project will explore typical, or random origami patterns with large numbers of folds. The question: are there secret patterns hiding in random collections of folds that only emerge for sufficiently many folds? This project will take a computational approach and the successful REU student will build a computer program to simulate random origami patterns with guidance from the faculty mentor. Along the way, the student will also gain some experience with theorems from differential geometry with the goal of finding their origami variants in the simulation data. A background in computer programming is a must for this project, and the student will be asked to code in MATLAB. So is a good working knowledge of linear algebra. Some background in differential geometry or differential equations would be helpful but is not strictly required.

Triple Product L-function

Project #: DC2024-20
Mentor: Michael Woodbury, Mathematics

Abstract: In number theory it is often the case that understanding the behavior of certain so-called L-functions at specific values leads to arithmetic results.
In this project we intend to consider analogues of the local factors in the case of groups defined over finite fields. This will require learning about and applying the theory of representations of finite groups. It could also involve enlisting the help of a computer to do calculations and collect data. Depending on progress made, this could lead to studying the representation theory of groups defined over real and/or p-adic numbers. We may also explore possible applications of our formulas.
Interested students should have taken at least one semester of abstract algebra, meaning that they have studied finite groups and some field theory. Programming experience is preferred. Having taken courses in Number Theory, Representation Theory, Complex and/or Real Analysis would be a plus.

For details about this project please check this pdf file

Vortices in geometry and physics

Project #: DC2024-21
Mentor: Paul Feehan, Mathematics

Abstract: Vortices were first proposed by physicists Ginzburg and Landau as mathematical models for superconductivity. Their vortex equations are non-linear first order PDEs for a U(1) connection A and a field phi over two-dimensional Euclidean space. Within the mathematical gauge theory community, their key properties were proved by Cliff Taubes (1980) and Arthur Jaffe and Cliff Taubes (1980). These basic Ginzburg-Landau vortex equations were extended by Steve Bradlow (1990 and 1991) to the case of unitary connections A on and sections phi of a Hermitian vector bundle E over Riemann surfaces and then complex Kaehler manifolds of any dimension, first in the case where E is a line bundle and then a bundle of arbitrary complex rank. In this project, we will extend Taubes' and Bradlow's work by learning how to use gluing to construct solutions to the vortex equations in increasing generality: (1) U(1) vortices over the complex plane C, (2) U(1) vortices on Hermitian line bundles over Riemann surfaces, (3) U(r) vortices on Hermitian vector bundles of rank r over Riemann surfaces, and finally (4) U(r) vortices on Hermitian vector bundles of rank r over complex Kaehler manifolds. Time permitting, we will also explore methods to extend these gluing results from the category of complex Kaehler manifolds to symplectic manifolds.

Understanding a platform's strategic entry into the marketplace

Project #: DC2024-22
Mentor: Xintong Wang, Computer Science

Abstract: Recent years have witnessed the rise of online marketplaces (e.g., Amazon, App Store) that create great convenience to facilitate matchings and improve economic efficiency. While these platforms use third-party sellers' data to facilitate better matchings, they have also leveraged those sales data to inform the decisions of launching their own products. Prominent examples include Amazon Basics and apps by Apple. Such strategic entries by the platform can serve multiple purposes, including price/quality control, push for innovation, or simply value capture that may incur unfair competitions due to the platform's dual role as both an intermediary and a seller.The goal of the project is to study the effects of platform entry, as informed by marketplace data, on third-party sellers as well as the efficiency and diversity of the ecosystem. Built on an initial model, we can further study and evaluate the effectiveness of various regulatory policies on platform's use of data. The methods to study this project can be (1) theoretical: e.g., establish an online learning model to study the impact of platform entry on seller's exploration and exploitation within an (unknown) product space;(2) empirical: e.g., study real-world datasets (e.g., Keepa) to summarize and model platform's strategic entries;(3) AI and agent-based modeling: e.g., build an agent-based model to capture platform's strategic entries and use AI (e.g., RL) to derive the optimal strategy for either the platform or third-party sellers. This can be combined with (1) or (2).

Correlation inequalities in Algebra, Combinatorics and Probability

Project #: DC2024-23
Mentor: Swee Hong Chan, Mathematics


Projects in Graph Theory

Project #: DC2024-24
Mentor: Bhargav Narayanan, Mathematics


Return to REU Home Page
DIMACS Home Page
DIMACS Contact Information
Page last modified on December 15, 2023.