![]() Proposed projects for the 2013 DIMACS and DIMACS/DIMATIA REU ProgramsPlease revisit this page frequently, additional projects will be posted through January. Project #: DC2013-01Mentor: Alexei Vazquez, vazquez at ias dot edu; CINJ Project Title: Optimization of Personalized Therapy for Cancer Treatment Years of experience in cancer therapy have taught us that using combinations of drugs for personalized therapy leads to better response rates [1,2]. However, designing personalized therapies is challenging because of the many drugs for cancer treatment and the many tumor properties that can inform treatment [2]. We formulate personalized combinatorial therapy as a constrained optimization problem, where markers (tumor properties) and treatment decision rules are assigned to drugs to achieve a high treatment response rate while using drug combinations of minimal size. The goal of this project is to design heuristic algorithms to solve this optimization problem. Requirements: Background in Statistics and SQL is desired. Participants are expected to develop scripts to statistically analyze a variety of data sets. References:
2. A. Vazquez, "Optimal Drug Combinations and Minimal Hitting Sets," BMC Syst Biol., 3(81)(2009). * You must be a enrolled at a U.S. university to be eligible for this project. Project #: DC2013-02Mentor: Janne Lindqvist, janne at winlab dot rutgers dot edu; WINLAB Project Title: Security and Privacy or Social Computing on Mobile Phones This project will consist of a timely research thrust on either security & privacy or social computing on mobile phones. The student will have access to one of the latest Android mobile phones and the actual project will be developed together depending on interests. Requirements: Programming experience, preferably Java, Android programming is a definite plus, experience with Eclipse is a plus, but not required. * You must be a enrolled at a U.S.
university to be eligible for this project. Project #: DC2013-03Mentor: Robert DeMarco, rvdemarco at gmail dot com; CCICADA Project
Title: Problems in (Random) Graph Theory * You must be a enrolled at a U.S.
university to be eligible for this project. Project #: DC2013-04
Mentor: Kevin Chen, kcchen at Biology dot rutgers dot edu; Biology
Project Title: Combinatorial algorithms for genome assembly Genome assembly is the problem of reconstructing a full linear genome sequence (up to 3 billion bases long) from short (~50-100 bases), overlapping sequences called “reads” sampled from the genome. This is a classic problem in bioinformatics whose importance has escalated with the emergence of next generation sequencing technologies that have dramatically reduced the cost of DNA sequencing. In most genome assembly algorithms, the fundamental mathematical object is the de Bruijn graph, which represents all the sequence overlaps between the reads. The genome sequence is then computed as an Eulerian path through the de Bruijn graph [1]. The project will be to efficiently implement a basic de Bruijn graph-based assembler along with subroutines for sequencing error correction and graph simplification.References: 1. Pevzner PA, Tang H, and Waterman MS, “An Eulerian Path Approach to DNA Fragment Assembly,” Proc Natl Acad Sci USA 98(17) (2001), 9748-9753. * You must be a enrolled at a U.S. university to be eligible for this project. Project #: DC2013-05 Mentor: James Abello, abello at dimacs dot rutgers dot edu; DIMACS
This project will develop methods to gauge public opinion on topics of current interest using Twitter messages and sentiment analysis algorithms [1]. Tweets matching selected topics will be collected and analyzed [2], and the end result will be visualized on a map using geo-tags contained in the Tweets. Topics could be related to politics (“Obama”, “Romney”), public policy (“deficit reduction”, “alternative energy”), or scientific exploration (“Mars rover”). Such tools help understand the mood in different parts of the US, and could also help understand shifts in sentiment over time. References: 1. Turney P, “Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews,” Proceedings of the Association for Computational Linguistics (2002), pp. 417–424.
* You must be a enrolled at a U.S. university to be eligible for this project. Project #: DC2013-06
1. Abello J, Ham
FV, Krishnan N, “AskGraphView: A Large Graph
Visualization System”, IEEE
Transactions in Visualization and Computer
Graphics, 12(5) (2006), 669-676.
Project #: DC2013-07
1. Hu YF, Koren Y, Volinsky C, “Collaborative
Filtering for Implicit Feedback Datasets,” IEEE
International Conference on Data Mining (2008).
1. Hochstättler, W and Schliep, A, “CATBox:
An Interactive Course in Combinatorial Optimization”,
Springer Verlag 2009.
Project #: DC2013-09
Project Title: Effects of architectural and
regulatory proteins on the spatial organization and
expression of bacterial genes Many
genetic processes are controlled by proteins that bind
at separate, often widely spaced, sites on DNA and
hold the intervening double helix in a loop. Our group
has initiated a series of computer simulations of the
formation of the DNA loops implicated in the
Escherichia coli lac operon, the classic textbook
example of genetic action at a distance [1]. We
incorporate effects of naturally occurring species
found in the cell, such as the non-specific
architectural protein HU
[2] and the Lac repressor protein, on the
looping free energies. By representing the DNA as a
sequence of base pairs, we capture the structural
details of protein-bound sites and the natural
deformability of free DNA [3-5]. This approach allows
for departures from conventional treatments of DNA,
such as ideal elastic models, which may be divorced
from reality, and leads to results that can be
directly linked to genetic processing at the molecular
level. We relate the computed looping propensities to
DNA looping propensitieis derived from gene repression
and single-molecule studies [6,7]. We are also
collecting the simulated structures in a database of
three-dimensional loop models that can be used as
starting points in the determination of structures
consistent with NMR and other local spectroscopic
measurements and subsequently refined with these
high-resolution data. The approach is general and can
be applied to any known protein-mediated DNA looping
systems as well as to the design of novel DNA
constructs, such as the three-dimensional origami
formed by looping DNA between carefully spaced
four-way junctions [8]. References: 1. Müller-Hill B (1996) The lac
Operon. Walter de Gruyter, Berlin.
* You must be a
U.S. Citizen or Permanent Resident to be eligible
for this project.
In this research, Higher
Order Learning techniques have been investigated and
applied to a number of interesting datasets including
Border Gateway Protocol (cybersecurity), E-Commerce
(web mining), and Nuclear Detection (homeland
security). The approach has been shown to outperform
the popular Support Vector Machine algorithm on
benchmark textual data sets as well as real-world
streaming data [3]. REU summer interns will have an
unparalleled opportunity to explore Higher Order
Learning algorithms and applications in the context of
law enforcement and other domains. References: 1. Getoor L,
Friedman N, Koller D, and Taskar B, “Learning
Probabilistic Models of Link Structure,” Journal of
Machine Learning Rsearch (JMLR) (2002).
* You must be
a U.S. Citizen or Permanent Resident to be eligible
for this project.
Project #: DC2013-12
Project #: DC2013-13
References: 1. A
Characterization Theorem and An Algorithm for A Convex
Hull Problem, B. Kalantari http://arxiv.org/pdf/1204.1873.pdf
* You must be a
U.S. Citizen or Permanent Resident to be eligible
for this project. Project #: DC2013-14 Mentor:
Eugene Fiorini, fiorini at dimacs dot
rutgers dot edu; DIMACS
Dr. Fiorini is the associate director and research professor with the Center for Discrete Mathematics and Theoretical Computer Science who works in the areas of graph theory, combinatorics, and extremal graph theory. Dr. Fiorini will work with a qualified REU student to identify a new problem on graph theory related to an area of graph theory called competition graphs. The notion of competition graphs were introduced by the mathematician Joel Cohen in 1968 when he asked: “Can we assign to each species an ecological niche so that competition between two species corresponds to overlap of their ecological niches?” The research problem in competition graphs will be of interest to both the student and the mentor. This project will appeal to those students who are interested in graph theory and combinatorics. * You must be a
U.S. Citizen or Permanent Resident to be eligible
for this project. Project #: DC2013-15 Mentor: Robert Vanderbei, rvdb at Princeton dot edu; Operations Research and Financial Engineering, Princeton University
Project Title: Climate Change
Analysis Dr. Robert Vanderbei is a leading researcher in operations research and financial engineering. Dr. Vanderbei is also an associated faculty member with the Astrophysics, Computer Science, Mathematics, and Mechanical and Aerospace Engineering departments at Princeton Universtiy. He would be happy to advise a summer REU student who is interested in refining the climate change analyses described in a "weather module" that is being developed for the Mathematics of Planet Earth 2013 project. In particular, it would be interesting to make a simplified version of the world climate change map in which each location is analyzed not by solving a daily model with sinusoidal seasonal fluctuations but rather a simpler model that uses annual averages for the input to a linear regression model. * You must be a
U.S. Citizen or Permanent Resident to be eligible
for this project. Project #: DC2013-16 Mentor: Kevin Chen, kcchen at Biology dot rutgers dot edu; Biology
Identifying differentially expressed genes between samples is a biologically important problem. Recent next-generation sequencing technologies such as RNA-seq have been greatly utilized for differential expression analysis. From RNA-seq, overlapping gene transcript fragments (~50-100 nucleotides) are generated. By assembling the fragments and counting the number of the fragments for a certain gene, the gene transcript abundance can be estimated. Cufflinks, an algorithm that computes transcript abundance of gene isoforms from RNA-seq data, implements a proof of a combinatorial result called Dilworth's theorem [1]. The goal of this project will be to learn the mathematical model behind Cufflinks and apply it to non-coding RNAs, as opposed to protein coding genes. References: * You must be a
U.S. Citizen or Permanent Resident to be eligible
for this project. Project #: DC2013-17 Mentor: Kevin Chen, kcchen at Biology dot rutgers dot edu; Biology
MicroRNAs (miRNAs) are a special class of small non-coding RNAs that play an important role in gene regulation. They are distinguished from other small RNAs in that they are processed from stem-loop structures to form ~22nt long, single-stranded RNAs. MiRNAs can silence genes by either degrading messenger RNAs, or simply keeping them from being translated into protein. Computational methods for discovering miRNA motifs and predicting their target genes is of clear biological importance. This project will center around applying new statistical methods to the problem and comparing them to established methods based on linear regression [1]. A background in statistics, especially linear models, and programming ability is preferred. References: * You must be a
U.S. Citizen or Permanent Resident to be eligible
for this project. Project #: DC2013-18 Mentor: Lazaros Gallos, k lazaros dot gallos at Rutgers dot edu; Ecology, Evolution, and Natural Resources
Many successful books and book series describe long stories that include a multitude of characters. The evolution of a story creates a timeline of events where book characters interact with each other. This timeline can then be translated into a dynamically evolving complex network. The analysis of complex networks is a rapidly expanding field and is ideally suited to extract information from complex interactions. In this case, the comparative analysis of the character affiliation networks in different books can be used to detect possible patterns that are either characteristic of a genre, a time-period for the genre, etc. Similarities and differences in story-building of different authors within and across genres are of particular importance and may hint to different writing styles. The goal of the project is to introduce one or two students to the fundamental concepts of complex network theory and guide the students through the whole process of acquiring the data, apply the learned concepts to the data analysis, and decide on the important findings that should be highlighted in a research paper. This network-based project should demonstrate the power of applying complex network theory to a variety of disciplines. The list of books can include, among hundreds of others, the Lord of the Rings, the Harry Potter series, the Gunslinger series, War and Peace, Les Miserables, the Iliad, etc. Simple data-mining techniques should be used to extract the interactions from the electronic text of the books and reading these books for the project should not be necessary. Requirements: A good knowledge of programming (preferably in C/C++) is preferred and some experience in a Linux-based environment may also be helpful. * You must be a
U.S. Citizen or Permanent Resident to be eligible
for this project. Project #: DC2013-19 Mentor:
Yana Bromberg, yanab at rci dot Rutgers dot
edu, Biochemistry and Microbiology Project Title: Metagenome analysisOur genomes define us as unique representatives of a very unique species. Many organisms exist, however, that lack this sense of self-identity. In fact, they are often unable to live without their environmental-niche "neighbors" . Taken together, the micro-organism population of a given niche is termed the microbiome. The DNA sequence of all microbiome members is the metagenome. Interpreting the molecular functions encoded in a given metagenome can help in biofuels, biopesticides, and antibiotic production. In this project we will try to identify the specific molecular functions and biological processes that take place within a given microbiome by looking at its metagenomic data. We will also try to define the minimal set of functions necessary for a given microbiome's existence in the specific environmental niche it occupies. * You must
be a U.S. Citizen or Permanent Resident to be
eligible for this project.
Project #: DC2013-20 Mentor:
Nina Feferman, lazaros dot gallos at Rutgers
dot edu, Ecology, Evolution, and Natural Resources Assistant
Graduate Mentor: Orin
Robinson Project Title: Game Theory of Mate SelectionThis project will use both agent based simulation and game theory to determine threshold effects in evolutionary fitness for individuals hoping to attract the best possible mate while hoping to avoid attracting competitors for those mates. Some previous experience in programming will be helpful, but not necessary. * You must be a U.S. Citizen or Permanent Resident to be eligible for this project.
|