DIMACS
DIMACS REU 2017

General Information

Student: Simon Bird
Office: CoRE 434
School: Rutgers University
E-mail: simon.bird@rutgers.edu
Project: Pan-Cancer Precision-Oncology Analysis of Tumor Mutational Signatures

Project Description

The primary goal of this project is to create a model and a partner program that statistically analyze a tumor sample and predict whether a mutation in it is hereditary.


Weekly Log

Week 1:
The first week was orientation. I learned the details of my project, then attended a conference on math-based cancer research at the Institute for Advanced Study.
Week 2:
I spent the second week primarily familiarizing myself with MATLAB and with the starter code my mentor had given me. I also reviewed cell biology and cancer mechanisms to better understand what was happening in the model.
Week 3:
I spent this week fixing bugs and writing my own partner programs to further develop the model.
Week 4:
I completed the main program this week. At the beginning of the project, we could quickly analyze the data if we knew the exact ploidy of the tumor cell. However, the exact ploidy is often unknown. So this week, I finished the expansion that analyzes a range of ploidies, which the user sets by specifying a "max ploidy". This is still not perfect, since the user needs a ballpark estimate of the true value, and the analysis will not work if the true value is outside the range. However, it is much more useful than our initial program. Additionally, there is a web version of the original program (exact ploidy, not max ploidy) available at http://www.khiabanian-lab.org/pages/lohgic.html
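To make the "max ploidy" idea concrete, here is a minimal sketch in Python (the project itself was written in MATLAB, and this is not the project code). It assumes the candidate models are distinguished by the expected variant allele fraction of a mutation in a mixture of tumor and normal cells, which is a common textbook formulation. The function names, the purity parameter, and the model layout are all hypothetical.

```python
def expected_vaf(purity, ploidy, mutated_copies, germline):
    """Expected variant allele fraction in a tumor/normal mixture.

    Assumption: normal cells are diploid and carry one copy of the
    variant if it is germline (heterozygous), none if it is somatic.
    """
    normal_variant_copies = 1 if germline else 0
    variant = purity * mutated_copies + (1 - purity) * normal_variant_copies
    total = purity * ploidy + (1 - purity) * 2
    return variant / total


def candidate_models(purity, max_ploidy):
    """Enumerate hypothetical models for every ploidy from 1 to max_ploidy."""
    models = []
    for ploidy in range(1, max_ploidy + 1):
        for copies in range(1, ploidy + 1):
            for germline in (False, True):
                vaf = expected_vaf(purity, ploidy, copies, germline)
                models.append((ploidy, copies, germline, vaf))
    return models


# Example: list every candidate for a sample that is 60% tumor cells.
for ploidy, copies, germline, vaf in candidate_models(0.6, 3):
    origin = "germline" if germline else "somatic"
    print(f"ploidy={ploidy} copies={copies} {origin}: VAF={vaf:.3f}")
```

The number of candidates grows quickly with the upper bound, which is one reason a ballpark estimate of the true ploidy is still useful.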
Week 5:
The analysis program works by computing an AIC value for each candidate model, then converting these values to weights so the models can be compared. However, the weights alone do not tell us whether we can draw conclusions from the results. In many scenarios drawing a conclusion is actually impossible, so it is important to recognize the cases where it is possible. To do that, I wrote a program that creates a theoretical tumor, takes binomial samples from that tumor, then analyzes them with the original program. The idea is that if we repeat this many times over a wide range of parameters, we obtain the weight a model typically receives when it is the true model. For example, consider a sample where models 1 and 2 both have weights of 0.3. What conclusions can we draw? Suppose the simulations show that model 1 usually has a weight of 0.6 when it is the true model, while model 2 usually has a weight of 0.3 when it is the true model. Then it is much more likely that model 2 is our true model. This test can be made more precise using the standard deviation of the weight. If a model's weight rarely deviates from a particular value when it is the true model, we can discount that model in some scenarios when its observed weight is far from that value, even if the observed weight is high. Implementing this more precisely is the goal for the next week.
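The simulation idea above can be sketched roughly as follows, in Python rather than the project's MATLAB. This is an illustration under stated assumptions, not the project code: each candidate model is reduced to a fixed expected allele fraction with no free parameters, the weights are standard Akaike weights, and the function names and example values are hypothetical.

```python
import math
import random
import statistics


def binomial_log_likelihood(k, n, p):
    """Log-probability of k variant reads out of n when the true fraction is p."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)


def akaike_weights(aic_values):
    """Convert AIC values to weights that sum to 1 (a lower AIC gives a higher weight)."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def simulate_true_weight(true_index, model_fractions, depth, trials, rng):
    """Mean and standard deviation of the weight given to the true model.

    Assumption: every fraction lies strictly between 0 and 1, and the models
    have no free parameters, so AIC reduces to -2 * log-likelihood.
    """
    weights = []
    for _ in range(trials):
        # Draw one binomial sample of reads from the theoretical tumor.
        k = sum(rng.random() < model_fractions[true_index] for _ in range(depth))
        aics = [-2 * binomial_log_likelihood(k, depth, p) for p in model_fractions]
        weights.append(akaike_weights(aics)[true_index])
    return statistics.mean(weights), statistics.stdev(weights)


# Example: three hypothetical models, 100 reads per sample, 200 repetitions.
rng = random.Random(0)
mean_w, sd_w = simulate_true_weight(0, [0.25, 0.5, 0.75], 100, 200, rng)
print(f"true-model weight: mean={mean_w:.3f}, sd={sd_w:.3f}")
```

Running the simulation once per candidate model gives a typical weight and spread for each, which is what an observed weight such as 0.3 can then be compared against.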

Presentations


Additional Information