DIMACS
DIMACS REU 2015

General Information

Student: Daniel O'Connell
Office: 446, CoRE
School: University of Connecticut
E-mail: daniel (dot) a (dot) o'connell (at) uconn (dot) edu
Project: Whole Transcriptome Sequencing and Analyzing the Immune Tumor Microenvironment

Project Description

My project uses computer programs to analyze transcriptome data from tumor samples collected during a clinical trial. I am specifically interested in determining whether a detectable immune response was induced in the tumors, but this type of data contains a great deal of information, and the research could go in many different directions depending on what the initial analysis finds. I will also get some exposure to the biological lab work that goes into this type of treatment.


Weekly Log

Week 1:
This week consisted mostly of reading papers and meeting with my advisors to gain background information about the project, mainly in preparation for the presentation I had to give at the end of the week. I also participated in many orientation activities.
Week 2:
This week there was more reading and more meetings with my advisors. I was told about the CIBERSORT software developed at Stanford and spent time familiarizing myself with it so I could use it on our data. On Tuesday there was a workshop on LaTeX and HTML, where I learned how to set up this website. On Wednesday there was a guest tutorial on complexity and conditional lower bounds.
Week 3:
This week consisted of running CIBERSORT and discussing the results with Dr. Lattime, as well as starting to work with the Gene Set Enrichment Analysis (GSEA) software. GSEA is another technique for examining gene expression, but it looks at predefined sets of genes rather than individual genes. This is useful because even when no single gene in a pathway changes significantly, the pathway as a whole may have changed expression levels together. I specifically ran the analysis on the gene sets for the human immune system. There was a problem with the format of our data: the program is designed for transcriptome data collected from a microarray, and ours was collected through RNA-seq. I am looking for a way to use this software and still get reliable results. We also had a rather lengthy but enjoyable tutorial on Math Modeling of Crowd Dynamics by Dr. Benedetto Piccoli.
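To illustrate why looking at a whole gene set can reveal a coordinated shift that no single gene shows, here is a minimal Python sketch of an unweighted running-sum enrichment score. This is a simplification and an assumption on my part, not what the GSEA software does internally: the real tool weights hits by the ranking metric by default and assesses significance by permutation. The gene names and the function `enrichment_score` are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (a simplified sketch).

    Walk down the ranked list: step up by 1/n_hits for each gene in the
    set, step down by 1/n_miss for each gene outside it, and return the
    running sum with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, most up-regulated gene first.
ranked = ["G1", "G2", "G3", "G4", "G5", "G6"]

# A set clustered at the top scores near +1; one at the bottom scores near -1.
top_score = enrichment_score(ranked, {"G1", "G2", "G3"})
bottom_score = enrichment_score(ranked, {"G4", "G5", "G6"})
```

A score near zero would mean the set's genes are scattered through the list with no coordinated shift.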
Week 4:
During this week, there was a lot of discussion about how best to run GSEA. The website recommended using the Preranked version, where the input is a list of genes, each with an associated metric showing how much its expression level changed. We tried many different tests and discussed how much to weight or ignore outliers, but we eventually settled on using the log2 of the fold change. There was also a speaker, David Molnar, who talked about connection games and Sperner's Lemma.
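As a rough illustration of the ranking metric described above, the following Python sketch computes the log2 fold change per gene and sorts the genes into a ranked list. Several things here are my assumptions rather than details from the project: the gene names and expression values are made up, the pseudocount of 1 (added to avoid taking the log of zero) is a common convention but not necessarily what we used, and the helper names are hypothetical. The output is written as two tab-separated columns (gene, score), which is the general shape of a ranked-list input file.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 of the ratio of mean expression values.

    The pseudocount is an assumption; it keeps the ratio defined when a
    gene has zero expression in one group.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


def rank_genes(expression, pseudocount=1.0):
    """Return (gene, score) pairs sorted from most up- to most down-regulated."""
    scores = {
        gene: log2_fold_change(treated, control, pseudocount)
        for gene, (treated, control) in expression.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def to_rnk(ranked):
    """Format the ranked list as tab-separated 'gene<TAB>score' lines."""
    return "\n".join(f"{gene}\t{score:.4f}" for gene, score in ranked)


# Hypothetical expression values: gene -> (treated samples, control samples).
example = {
    "GENE_A": ([7.0, 7.0], [1.0, 1.0]),  # (7+1)/(1+1) = 4, so log2 = 2
    "GENE_B": ([3.0, 3.0], [3.0, 3.0]),  # no change, so log2 = 0
    "GENE_C": ([0.0, 0.0], [3.0, 3.0]),  # (0+1)/(3+1) = 0.25, so log2 = -2
}

ranked = rank_genes(example)
print(to_rnk(ranked))
```

Taking the log makes the metric symmetric: a fourfold increase and a fourfold decrease score +2 and -2, so up- and down-regulated genes end up at opposite ends of the list.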
Week 5:
This week started with looking at the results of GSEA on the data from the pancreas tumors. Then I was given new data from tumors of people treated with the drug sorafenib, and I compared the gene expression levels of people who did not respond well to the drug with those of people who responded well. There were also 4th of July activities going on.
Week 6:
Sorafenib is a kinase inhibitor, so I used GSEA to look at the biological pathways associated with the kinases it targets. There were interesting differences in the activity of these pathways between the good responders and the non-responders, specifically in the immune pathways, so I ran the data through CIBERSORT and also saw some interesting trends in a few cell populations. There was also a talk by Dr. Neil Sloane on his On-Line Encyclopedia of Integer Sequences and a discussion about applying to grad school.
Week 7:
This week I had a meeting with the researcher who conducted the sorafenib study, so I spent a lot of time verifying my results, making sure everything was in a presentable form, and making sure I knew the results inside and out. We discussed what I had found, what he saw as possible future questions, and what he would like me to do. One thing I started on was searching online for publicly available data sets from tumor samples of patients who have liver cancer and hepatitis B, to see whether the trends we saw were consistent in a larger population. We also had our second presentations on Friday, so I had to prepare for those.
Week 8:
This week I had to wrap up most of my work. I finished finding the data available online and looking in it for the trends that we saw in our data. This was not straightforward, because comparing data from different sources is always difficult: the groups follow different procedures and use different tools. There were some predictable complications, but nothing that we could not fix with some data normalization and other tricks. I also had to do some things for the REU, like actually updating my website.
Week 9:
This is the last week of the program and the main thing that I did was write the final report that we need to hand in for the program to get funding in the future. We had a talk about statistics and biostatistics and how they can be applied to problems in genetics and medicine. There were also final goodbye meetings with my advisors and saying goodbye to all of my new friends. I am happy to go home and see my family but I am sad to see this program end. It was a great opportunity to meet and work with some amazing people and both the research and non-research parts were incredibly enjoyable.

Presentations


Additional Information