Proposed projects for the 1997 DIMACS REU Program

Supervisors and Suggested Problems

Faculty Supervisors

The following faculty have expressed interest in participating in the program in 1997-98. (Other faculty and project areas may be added later.) The suggested area of research appears in parentheses. (Note: RUTCOR is the Rutgers Center for Operations Research.)

Suggested Projects

The following topics have been suggested by REU research supervisors, and offer a range of both applied and theoretical problems. The topics are of current interest to Computer Scientists and Mathematicians, but can be understood in a short period of time by a well-trained undergraduate. Supervisors may also be willing to work on other topics, depending on the student's and their own interests.

* (Project 1) Decision-Making Under Uncertainty (Adi Ben-Israel)

In business, or in personal life, we are often required to make decisions without having all the relevant information. Such information (e.g. the local weather a week hence, or next-quarter earnings of IBM) may become known only after the decision.

This project is a study of situations which can be modeled as multi-stage decision processes, with uncertainty represented by random variables with known distributions. At each stage a decision is made, then a random variable is sampled, and the return (to the decision maker) depends on the decisions and observed values at all stages. Examples are in finance (portfolio selection), insurance (optimal coverage), production under price uncertainty (e.g. agriculture), inventory control (optimal levels) and artificial intelligence.

A good background for this project includes knowledge of probability and linear algebra, and computational experience. Some exposure to optimization methods such as linear or dynamic programming, as well as experience using a language such as LISP or PROLOG would also be useful.

* (Project 2) Iterative Methods for Solving Nonlinear Equations (Adi Ben-Israel)

Nonlinear equations in one variable, f(x)=0, or systems of such equations in several variables, are solved numerically by iterative methods. Each iteration typically begins with an approximate solution, produces a "better" approximation, or decides to stop (because the current approximation is good enough, or it cannot be improved by the algorithm).

This project studies some well-known iterative methods, including the Newton and Halley methods. Topics include:

  1. Iterative methods obtained by applying Newton & Halley to modified functions g(x):=h(x)f(x), or transformed variables g(u):=f(h(u)), for suitable functions h designed to improve convergence.
  2. Chaotic phenomena associated with iterative methods (see e.g. the project webpage Iterative Methods and Complex Dynamics).
  3. The order-of-convergence of the quasi-Halley method (proved to be at least 2.41, and observed to be close to 3).
Knowledge of advanced calculus, complex variables, numerical analysis and programming is desireable. The project involves numerical experiments, using MATLAB or symbolic software (MATHEMATICA, MAPLE, etc) which the student should be ready to learn in short order.

* Geometry and Protein-Folding (Scott Decatur)

An important and challenging problem in molecular biology is the task of determining the native three-dimensional structure of a protein when only given the sequence of amino acid residues which compose the protein chain. Due to the complexity of the protein folding problem, scientists have studied a variety of simplifications of the general problem. Dill (1985) introduced one such model, called the Hydrophobic-Polar (HP) Model.

The HP model abstracts the problem by first grouping the 20 amino acids which compose proteins into two classes: hydrophobic (or non-polar) residues and hydrophilic (or polar) residues. Three-dimensional space is commonly discretized in this model by embedding the protein in a regular lattice. The "native" fold for the protein is said to be the one which has largest number of pairwise contacts between hydrophobic residues. The biological foundation of this model is the belief that the first-order driving force of protein folding is due to a "hydrophobic collapse" in which those residues which prefer to be shielded from water (hydrophobic residues) are driven to the core of the protein, while those which interact more favorably with water (polar residues) remain on the outside of the protein.

This project will involve: (1) Approximation Algorithms (more theoretical than experimental): development of efficient algorithms which fold proteins such that the number of contacts is provably close to the protein's optimal number of contacts; AND/OR (2) Energy Surface Exploration (more experimental than theoretical): development of algorithms which examine the space of possible folds and the energy (i.e. number of contacts) of the folds in order to elucidate the pathways of protein folding.

* Mobile Robot Navigation (Sven Dickenson)

A key problem in mobile robotics is to find a way to program a mobile robot (equipped with a camera) to visually navigate its environment, i.e., visually determine its position (coordinates). A promising approach is to have the robot first determine the optimal set of visual landmarks for navigation, and then use these landmarks to compute its position. The problem of finding the optimal set of landmarks can be formulated as a graph coloring problem with special constraints introduced by the nature of the problem. The goal of this project is to design, implement, and/or test algorithms to solve this problem.

In order to test both landmark acquisition and navigation algorithms, an environment simulator is needed in which indoor environments with different shapes, obstacles, and landmarks can be created. The student's role in this project will be to design and implement (in software) such an indoor environment simulator. This will not only include exposure to computational geometry algorithms, but to computer graphics techniques as well. Once the simulator is complete, the student will test various landmark acquisition algorithms using the simulator, and, if time permits, promising approaches will be tested on a mobile robot operating in a mock-up nuclear reactor in Canada.

Students with an interest in graph theory and graph algorithms are encouraged to apply. A background in algorithm design and programming experience in C would be be very useful. In addition, the design of the simulator offers interesting problems in computer graphics.

* Protein Structure Prediction (Mona Singh)

An important problem in computational molecular biology is that of predicting the three dimensional structure of a protein given only its one dimensional amino acid sequence. This problem is of practical importance since the biological function of a protein depends upon its three dimensional structure. Although the general problem remains unsolved, computational methods can sometimes provide a first step in protein structure determination.

This project will approach the protein structure prediction problem using sequence-structure alignments. A sequence-structure alignment, also referred to as a threading, is sometimes used to predict the overall structure of a protein by aligning protein sequences onto 3D structures, using a pairwise energy potential. Although the general threading problem is NP-complete, in practice this approach is one of the more successful methods for 3D structure prediction. This project will be an experimental and/or theoretical study of algorithms for a simplified version of this problem.

Students should have a background in algorithms and programming experience in C.

* DNA Sequences and Three-dimensional Structure (Wilma Olson)

[Students interested in this topic may wish to consider attending the Workshop on DNA Topology II]
The DNA base sequence, once thought to be interesting only as a carrier of the genetic blueprint, is now recognized as playing a structural role in modulating the biological activity of genes. Through subtle variations of bending and twisting at the level of neighboring base pairs, the chemical sequence not only generates a higher-order folding of the double helix itself but also produces structural motifs that facilitate the specific binding of proteins and other ligands. The consequent looping and twisting of the DNA assumes an important role in various biological processes.

To understand how the genetic code is translated into three-dimensional structures used in biological processes, our group is analyzing the known geometry of DNA bases in chemical structures and developing mathematical models to incorporate the sequence-dependent bending, twisting, and translation of known fragments in DNA molecules of varying lengths and chemical composition.

* Computational Group Theory (Charles Sims)

Symbolic computation packages, such as Maple, Mathematica, or GAP, can be used to gain insight into significant open problems in group theory. Many of these problems are accessible to an undergraduate with a good background in algebra. For example, a classic problem of Burnside is to determine whether any finitely-generated group with finite exponent n (x^n = 1 for all x in the group) is necessarily finite. This is known to be true when n = 1, 2, 3, 4, or 6, and false for large n. Thus the smallest open case is n = 5. Two possible projects related to Burnside's problem are:

(1) A study of the structure of the largest finite group R which has exponent 5 and is generated by two elements. (R is known to have order 5^34.)

(2) A comparison of groups of exponent 5 with groups of exponent 6 in an effort to understand better why the Burnside problem with exponent 6 is easier than with exponent 5.

* Algorithms for Simulations of Traffic in Wireless Networks (Andrew Ogielski)

Mobile computing, with portable computers connected to the Internet via a wireless network wherever the users happen to be, will enable people to communicate and work in many radically new ways. Unlike the wired data networks, wireless networks exhibit high error rates and interruptions of transmission due to the physics of radio propagation. Such behavior of the transmission channels interacts in unexpected ways with network protocols and applications.

Due to high complexity of communications networks, studies of their behavior rely on extensive simulations. Good simulations require good approximate models of packet traffic. Many recent measurements showed that packet traffic has the characteristics of very bursty, self-similar (fractal) stochastic processes, for all kinds of data networks. This is very different from the well-studied voice traffic in the telephone networks.

This project will involve reviewing recent research literature on algorithmic techniques for generating bursty stochastic processes with long range dependence, formulating a traffic source model, and eventually implementing it in a C/C++ based description framework for parallel simulation. Basic knowledge of probability theory and stochastic processes is required. Some knowledge of data communications would be helpful, but is not required. Working knowledge of C or C++ is needed for model implementation, but the project will be successful if it delivers a well designed pseudocode instead of a running program.

* Computational Geometry Applied to Wireless Communications (Diane Souvaine)

The goal of our project is to design and build a system for radio wave propagation prediction. At this stage in the projects development, we have a crude simulator that performs radio wave propagation prediction using only the simplest of modeling methods (ie., free-space, manhattan distance) and a query module that can be used to track and provide information on the simulated mobile users. A few more prediction methods will be added before the start of summer so that we will be ready for the addition of a very important component, a ray-tracing system for radio wave propagation modeling and signal strength prediction. This is the project we wish to have an REU work with us upon.

At the beginning of the project we will need to research and discuss the various methods for performing fast ray-tracing, the data-structures involved, previous work in this area (like Fortune's and Rappaport's ray-tracing modeling systems for radio wave propagation prediction) and the potential for improving said methods and designs. Next we will need to design the system based upon our conclusions. Finally, the chosen algorithms will need to be coded, tested and verified.

The above paragraph only discusses the computational geometry aspect to this problem. There is the second issue which concerns how radio waves should be modeled in this domain. We will be relying upon domain experts and the literature for answers to most of these questions but the fact remains that we will have to mold the ray-tracing system to fit the needs of this radio wave propagation model. Where shortcuts can or should be taken, what information about the terrain will be necessary and what latest research can help improve this model, are all interesting questions. Some have been answered but many still remain.

Index Index of REU Program
DIMACS Home Page
Contacting the Center
Document last modified on October 29, 1998.