2014 DIMACS REU Proposed Project Descriptions

Please revisit this page frequently, additional projects will be posted through January.


Project #: DC2014-01

Mentor: Jaroslaw Zola, jaroslaw.zola (at) rutgers.edu, Rutgers Discovery Informatics Institute
Project Title: Interactive System for Exploratory Analysis of Large Protein Graphs

Pairwise similarity between many proteins can be naturally represented as a graph in which nodes correspond to proteins, and edges are introduced whenever two proteins meet some predefined similarity criteria. Such similarity graphs provide domain experts with many valuable insights about, for example, forming of protein complexes, transfer of protein function, etc. The goal of this project is to develop an interactive web-based system to enable exploration of annotated similarity graphs with millions of nodes and edges. The system will be based on a client-server architecture, where client will use Ajax/Javascript to visualize data and to interact with a user, while server will be responsible for graph management and queries processing. A summer REU intern will be involved in implementing the client side, and will help to design required data exchange protocols.

Requirements: Prior experience with Javascript and JSON are desired. Familiarity with the D3js library will be helpful.
* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-02

Mentor: Vinay Vaishampayan, vavaishampayan (at) icloud.com, DIMACS
Project Title: Dissections of Polyhedra

Dissections of polyhedra have been studied by geometers for thousands of years, are much loved by puzzle creators and solvers, and have recently found interesting applications in coding theory. Given two polyhedra A and B in n-dimensional Euclidean space, can we cut A into pieces A1, A2,…,Am and B into pieces B1,B2,…,Bm, where m is a finite number, such that Ai is congruent to Bi, i=1,2,…,m. In this summer project we will explore computational and visualization tools to help us construct good dissections, and will explore other applications to the information sciences. 

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-03

Mentor: William Pottenger, billp at dimacs.rutgers.edu, DIMACS and Computer Science
Project Title:  Higher Order Learning

In our work we are developing a framework termed Higher Order Learning that builds models by leveraging associations in the form of latent relationships in an entity-relation graph. A general multi-attribute entity-relation graph has nodes that are people, places and things, etc. Higher Order Learning violates the underlying assumption made in traditional machine learning algorithms that records are independent and identically distributed(Taskar et al, 2002). Although this assumption simplifies the underlying mathematics of statistical models and of the corresponding parameter estimation procedures, it actually does not hold for many real world applications (Getoor & Diehl, 2005). Machine learning methods based on Higher Order Learning leverage relations between objects and features in an entity-relation graph and in doing so, operate on a much richer data representation compared to the traditional feature vector form.

In this research, Higher Order Learning techniques have been investigated and applied to a number of interesting datasets including Border Gateway Protocol (cybersecurity), E-Commerce (web mining), and Nuclear Detection (homeland security). The approach has been shown to outperform the popular Support Vector Machine algorithm on benchmark textual data sets as well as real-world streaming data Ganiz, Lytkin & Pottenger). REU summer interns will have an unparalleled opportunity to explore Higher Order Learning algorithms and applications in the context of law enforcement and other domains.


Project #: DC2014-04

Mentor: William Pottenger, billp at dimacs.rutgers.edu, DIMACS and Computer Science
Project Title:  Entity Resolution

Emergencies invariably require crisis managers to field numerous calls for service such as 911, radio reports by police and fire teams, medical reporting, infrastructure support teams, etc. Technologies for Entity Resolution can be applied to handle the confusion by identifying geospatial, temporal, and content similarities in the messages, enabling first responders to resolve and identify the location and scope of multiple events. Entity Resolution also plays an important role in counter-terrorism, for example, in determining when two terrorist incidents are the same. This project will involve the research and development of Entity Resolution techniques to group incidents from the Internet Crime Complaint Center database (www.ic3.gov). REU summer interns will have an unparalleled opportunity to explore Entity Resolution algorithms and applications in the context of law enforcement and other domains.


Project #: DC2014-05

Mentor: Brian Ricks, bcr50 at scarletmail.rutgers.edu, CCICADA
Project Title:  Data-driven Social Pedestrian Behavior

Crowd simulations are a powerful tool for predicting behavior, designing buildings, and preparing for emergencies. While much is known about the individual behavior of people in a crowd, less is known about how people move when they are in groups. Such groups include families, coworkers, and friends—each of which have their own dynamics. This project takes seeks to create a data-driven model of social pedestrian behavior by using actual anonymous footage of pedestrians. Successful completion of this project will involve creative solutions to image processing, obstacle avoidance, and social behavior problems.

Requirements: A strong background in programming and image process. Participants will n build a real-time crowd simulation as part of this project.

Project #: DC2014-06

Mentor: Brian Ricks, bcr50 at scarletmail.rutgers.edu, CCICADA
Project Title:  Modelling Panic Behavior

As the number of active shooter events continues to rise, many agencies are anxious to help protect citizens in a variety of setting. One way to prepare for active shooter events is through the creation of accurate, agent-based models of crowd behavior in a disaster scenario. Such models could be used to simulate various law enforcement tactics, building designs, and optimal crowd behaviors. Unfortunately, a highly accurate model does not exist. In this project we seek to build a high-fidelity model based on actual data from previous active shooter events. Using this model we will create real-time simulations of panic behavior to test various theories and proposals about active shooter prevention and mitigation.

Requirements: A strong background in programming. Participants will build a real-time crowd simulation as part of this project.

Project #: DC2014-07

Mentor: Brian Nakamura, bnakamura at gmail.com, CCICADA
Project Title:  Research in Enumerative/Algebraic Combinatorics

Enumerative combinatorics is an area of combinatorics that deals with the number of ways certain patterns can be formed. Most students are familiar with two simple examples of enumerative combinatorics: counting combinations and counting permutations. Algebraic combinatorics employs methods of abstract algebra, such as group theory and representation theory, within combinatorial contexts, as well as applying combinatorial techniques to algebra. Dr. Nakamura is a leading researcher with the Command Control Interoperability Center for Advanced Data Analysis (CCICADA) who research interests include enumerative and algebraic combinatorics, both from a theoretical and applied perspective. Dr. Nakamura will work with a qualified REU student to identify a new problem in enumerative/algebraic combinatorics that will be of interest to both the student and the mentor. This project will appeal to those students interested and applying algorithmic and computational techniques in a research setting.


Project #: DC2014-08

Mentor: James Abello, abello at dimacs dot rutgers dot edu
Project Title: Measuring the Mood of the Nation via Twitter

This project will develop methods to gauge public opinion on topics of current interest using Twitter messages and sentiment analysis algorithms [1]. Tweets matching selected topics will be collected and analyzed [2], and the end result will be visualized on a map using geo-tags contained in the Tweets. Topics could be related to politics (“Obama”, “Romney”), public policy (“deficit reduction”, “alternative energy”), or scientific exploration (“Mars rover”). Such tools help understand the mood in different parts of the US, and could also help understand shifts in sentiment over time.

References:

Turney P, “Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews,” Proceedings of the Association for Computational Linguistics (2002), pp. 417–424.
Gansner E, Hu Y, and North S, “Visualizing Streaming Text Data with Dynamic Maps, to appear in Proceedings of the 20th International Symposium on Graph Drawing” (best paper award), (2012), TwitterScope website.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-09

Mentor: James Abello, abello at dimacs dot rutgers dot edu
Project Title: Visualization of Time Varying Graphs

ASK-GraphView is a node-link-based graph visualization system that allows clustering and interactive navigation of large graphs [1]. The system constructs a hierarchy on an arbitrary, weighted, undirected input graph using a series of increasingly sophisticated clustering algorithms. Building on this earlier work, this project will address the problem of visualizing graphs that are being collected in a streaming fashion. A key challenge is to identify graph sub-structures and statistics on them that can be used as the basis to depict evolution of the graph through time [2]. Another goal is to devise a visualization algorithm that produces a layout that is stable over time, thus preserving a user’s mental map.

References:

Abello J, Ham FV, Krishnan N, “AskGraphView: A Large Graph Visualization System”, IEEE Transactions in Visualization and Computer Graphics, 12(5) (2006), 669-676.
Hu Y, Kobourov S, and Veeramoni S, “Embedding, Clustering and Coloring for Dynamic Maps, Proceedings of IEEE Pacific Visualization Symposium, (2012).

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-10

Mentor: James Abello, abello at dimacs dot rutgers dot edu
Project Title: Detecting anomalies in Internet traffic

This project will investigate algorithms for detecting anomalies in Internet traffic. Machine learning algorithms, in particular, implicit collaborative filtering [1], will be used to model normal Internet traffic records. Using this model, each machine is represented as a point in high dimension. Low-dimensional embedding algorithms [2] are then used to project points to 2 or 3D for visualization. Anomalies can be detected by visual inspection, as well as identified using clustering algorithms.

References:

Hu YF, Koren Y, Volinsky C, “Collaborative Filtering for Implicit Feedback Datasets,” IEEE International Conference on Data Mining (2008).
Borg I and Groenen P, “Modern Multidimensional Scaling: theory and applications,” New York: Springer-Verlag (1997).

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-11

Mentor: Wilma Olson, Wilma dot olson at rutgers dot edu, Chemistry & Chemical Biology
Project Title: Effects of architectural and regulatory proteins on the spatial organization and expression of bacterial genes

Many genetic processes are controlled by proteins that bind at separate, often widely spaced, sites on DNA and hold the intervening double helix in a loop. Our group has initiated a series of computer simulations of the formation of the DNA loops implicated in the Escherichia coli lac operon, the classic textbook example of genetic action at a distance [1]. We incorporate effects of naturally occurring species found in the cell, such as the non-specific architectural protein HU [2] and the Lac repressor protein, on the looping free energies. By representing the DNA as a sequence of base pairs, we capture the structural details of protein-bound sites and the natural deformability of free DNA [3-5]. This approach allows for departures from conventional treatments of DNA, such as ideal elastic models, which may be divorced from reality, and leads to results that can be directly linked to genetic processing at the molecular level. We relate the computed looping propensities to DNA looping propensitieis derived from gene repression and single-molecule studies [6,7]. We are also collecting the simulated structures in a database of three-dimensional loop models that can be used as starting points in the determination of structures consistent with NMR and other local spectroscopic measurements and subsequently refined with these high-resolution data. The approach is general and can be applied to any known protein-mediated DNA looping systems as well as to the design of novel DNA constructs, such as the three-dimensional origami formed by looping DNA between carefully spaced four-way junctions [8].

References:

Müller-Hill B (1996) The lac Operon. Walter de Gruyter, Berlin.
Swinger KK, Lemberg KM, Zhang Y & Rice PA (2003) Flexible DNA bending in HU-DNA cocrystal structures. EMBO J 22, 3749-3760.
Olson WK, Gorin AA, Lu X-J, Hock LM & Zhurkin VB (1998) DNA sequence-dependent deformability deduced from protein-DNA crystal complexes. Proc Natl Acad Sci USA 95, 11163-11168.
Swigon D, Coleman, BD & Olson WK (2006) Modeling the Lac repressor-operator assembly: the influence of DNA looping on Lac repressor conformation. Proc Natl Acad Sci USA 103, 9879-9884.
Czapla L, Swigon D & Olson WK (2008) Effects of the nucleoid protein HU on the structure, flexibility, and ring-closure properties of DNA deduced from Monte-Carlo simulations. J Mol Biol 382, 353-370.
Becker NA, Kahn JD & Maher 3rd LJ (2005) Bacterial repression loops require enhanced DNA flexibility. J Mol Biol 349, 716-730.
Han L, Garcia HG, Blumberg S, Towles KB, Beausang JF, et al. (2009) Concentration and length dependence of DNA looping in transcriptional regulation. PLoS ONE 4, e5621.
Han D, Pal S, Nangreave J, Deng Z, Liu Y & Yan H (2011) DNA origami with complex curvatures in three-dimensional space. Science 332, 342-346.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-12

Mentor: Bahman Kalantari, kalantar at cs dot rutgers dot edu, Computer Science
Project Title: Polynomiography

Dr. Bahman Kalantari is a leading researcher with the Department of Computer Science who was instrumental in developing the field of polynomiography. Polynomiography is a mathematically-inspired computer medium based on algorithmic visualizations of one of the most basic and fundamental tasks in science and mathematics: solving a polynomial equation. Polynomiography serves as a powerful medium for creativity, discovery, and learning, with numerous applications in education, math, science, art and design. Dr. Kalantari will work with a qualified REU student to identify a new problem from polynomiography that will be of interest to both the student and the mentor. In particular, some specific problems will be described from computational geometry, numerical analysis, discrete math, education, algorithmic mathematical art, fine art, and games.
http://www.cs.rutgers.edu/~kalantar/

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-13

Mentor: Bahman Kalantari, kalantar at cs dot rutgers dot edu, Computer Science
Project Title: The Triangle Algorithm and Applications in Linear Programming, Combinatorial Optimization, and Convex Programming

The Triangle Algorithm is a newly developed algorithm for testing if the convex hull of a finite set of points contains a distinguished point. It has potential applications in computational geometry, linear programming, convex programming and combinatorial optimization. It also gives rise to new iterative methods for solving a linear system of equations. The goal of this research is to explore some of these possibilities both from the computational points of view as well as theoretical ones. Related articles can be found online: http://www.cs.rutgers.edu/~kalantar/

References:

A Characterization Theorem and An Algorithm for A Convex Hull Problem, B. Kalantari http://arxiv.org/pdf/1204.1873.pdf
Solving Linear System of Equations Via A Convex Hull Algorithm, B. Kalantari http://arxiv.org/abs/1210.785
Finding a Lost Treasure in Convex Hull of Points From Known Distances, B. Kalantari http://2012.cccg.ca/papers/paper20.pdf

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-14

Mentor: Eugene Fiorini, fiorini at dimacs dot rutgers dot edu, DIMACS
Project Title: Problems in (Competition) Graph Theory

Dr. Fiorini is the associate director and research professor with the Center for Discrete Mathematics and Theoretical Computer Science who works in the areas of graph theory, combinatorics, and extremal graph theory. Dr. Fiorini will work with a qualified REU student to identify a new problem on graph theory related to an area of graph theory called competition graphs. The notion of competition graphs were introduced by the mathematician Joel Cohen in 1968 when he asked: “Can we assign to each species an ecological niche so that competition between two species corresponds to overlap of their ecological niches?” The research problem in competition graphs will be of interest to both the student and the mentor. This project will appeal to those students who are interested in graph theory and combinatorics.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-15

Mentor: Kevin Chen, kcchen at dls dot rutgers dot edu, Biology
Project Title: Designing new machine learning methods using techniques from linear algebra and multivariable calculus

We are developing novel computational techniques for inferring parameters of probabilistic models for sequential data, such as time-series or biological sequence data. The specific summer project is to show, either theoretically through formal proofs or empirically by writing simple computer programs, whether certain functions of the data are sufficient to recover the parameters of the model. Our specific biologic application is to segment the human genome into distinct domains based on epigenetic marks. This project is most suitable for student who is proficient in linear algebra and multivariable calculus and has basic programming ability (e.g. in Matlab). An interest in statistics / machine learning or biology would be very welcome but not necessary to do the project.

References:

1) Section 4 in Identifiability and unmixing of latent parse trees. Daniel Hsu, Sham M. Kakade, and Percy Liang. NIPS 2012.
2) cs.stanford.edu/~pliang/papers/identifiability-nips2012-poster.pdf

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-16

Mentor: Frank Hsu, hsu at cis dot fordham dot edu, Computer Science
Project Title: Information Fusion Paradigm, Combinatorial Fusion Analysis (CFA)

Dr. Hsu holds the Clavius Distinguished Professor of Science Chair and is the director of the Laboratory of Informatics and Data Mining at Fordham University. He is currently a visiting research professor with the Center for Discrete Mathematics and Theoretical Computer Science and has interests in the areas of combinatorics, algorithms, and optimization. An information fusion paradigm, combinatorial fusion analysis (CFA) he and colleagues developed, has applications in several domains including information retrieval, target tracking, cyber security, virtual screening, DNA sequencing informatics, bioinformatics, cognitive neuroscience, and portfolio management. Dr. Hsu will work with a qualified REU student to identify a new problem in an area related to his research interests. The research problem will be of interest to both the student and the mentor.


Project #: DC2014-17

Mentor: Robert Vanderbei, rvdb at princeton dot edu, Operations Research and Financial Engineering, Princeton
Project Title: New Stable Periodic Solutions to the n-Body Problem

Dr. Vanderbei is a distinguished researcher and professor with the Department of Operations Research and Financial Engineering at Princeton University. His current research interests include topics in applied mathematics, astrophysics, computer science, and statistics. He is currently working on a problem in applied and computational mathematics investigating mass ratios that lead to stable ring systems versus unstable ones. He would be happy to advise a summer REU student who is interested in researching new stable periodic solutions to the n-body problem. Dr. Vanderbei will work with a qualified REU student to identify a new problem in an area related to his research interests. The research problem will be of interest to both the student and the mentor.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: DC2014-18

Mentor: Janne Lindqvist, winlab dot rutgers dot edu, WINLAB
Project Title: Security and Privacy or Social Computing on Mobile Phones

This project will consist of a timely research thrust on either security and privacy or social comuting on mobile phones. The student will have access to one of the latest Android mobile phones and the actual project will be developed together depending on interests.

Requirements: Programming experience, perferably Java, Android programming is a definite plus, experience with Eclipse is a plus, but not required.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.


Project #: DC2014-19

Mentor: Christie Nelson, christie dot l dot nelson dot phd at gmail dot com, CCICADA
Project Title: Modeling Social Media for Diaster Situations

Social media has many emerging outlets of communication that did not exist even ten years ago, including Twitter, Facebook, micro-blogs, etc. Social media and text messaging can be utilized for quick communication when time is of the essence, particularly during disasters. During a disaster type situation, could the way that social media and text messaging communication is currently done be improved? In particular, on the public side, is there a way to improve the wording of messages to aid in the automation of classification? Should a supervised or unsupervised approach be taken? A supervised approach uses a labeled training set to build a model, while an unsupervised approach does not. A project of joint interest to the mentor and to the student will be identified. Some background in programming is encouraged.

* You must be a U.S. Citizen or Permanent Resident to be eligible for this project.

Project #: RUMath2014-1

Mentor: Christopher Woodward, woodwardc at gmail dot com, Mathematics
Project Title: The Yang-Mills equation

The Yang-Mills equation is an equation for a "connection" which comes up in physics; a special case is Maxwell's equations for electromagnetic fields. The Yang-Mills "heat flow" is a method for constructing solutions to the Yang-Mills equations. This project will focus on variations on the problem of the Yang-Mills heat flow, such as on spaces with cylindrical ends, with fields etc. Interested students should have at least some analysis and an interest in physics.


DIMACS Home Page
DIMACS Contact Information