Sivasomasundari Arunarasu, DIMACS REU 2021

Name:	Sivasomasundari (Sundari) Arunarasu
Email:	sundari.arunarasu (at) emory.edu
Institution:	Emory University
Mentor:	Dr. Subhajyoti De
Project:	Genomic Data-Guided Mathematical Modeling of Cancer

About My Project

Weekly Summary

Week 1 (5/24-5/30)

This was the first week of the REU program, and so I attended the orientation meeting, where we learned more about DIMACS and the specifics of the program. The following day, our graduate mentor Parker held a meeting where we learned about creating web pages using HTML, and how to connect to the DIMACS server to update our personal pages. I also met with my mentor and a postdoc in his lab to discuss the specific project I would be working on during the program. Within the broader topic of tumor heterogeneity (the fact that cells within a tumor are different from each other), I will be focusing on analyzing lab data to determine how exactly these cells differ and how their phenotypes change over the course of tumor growth. I began initial data analysis in R with the dataset that my mentor sent me, and I created my introductory presentation about the research that I will be working on this summer. I also read several journal articles about intratumor heterogeneity, cell morphology, and characteristics of stem cells to help me understand more of the background behind Dr. De's research.

Week 2 (5/31-6/6)

This week started with giving my introductory presentation and learning about the other participants' projects, which I found very interesting. I continued to perform exploratory data analysis using R on the cell dataset collected from a lab experiment tracking the characteristics of tumor cells over time. I noted down some of the interesting correlations between variables, and discussed these with my mentor in our weekly meeting. The postdoc I am working with, Antara, also explained more details about the different types of cells found in tumors (holoclones, meroclones, and paraclones), and the general characteristics of each. This new information changed my view of the data analysis I had conducted thus far, so as I continued my work after the meeting, I had a better understanding of the trends or associations that I would expect to see. My mentor also introduced the topic of principal component analysis (PCA), which I started learning about, and will focus on next week as well, with the goal of implementing PCA on my dataset.

Week 3 (6/7-6/13)

I attended my first lab group meeting, where the members of Dr. De's lab discussed their research and the progress they had made with the group. I thought this was a very good way to learn about topics that are different from my own project, but still related under the broader theme of cancer heterogeneity and genomics. This week I continued with my PCA analysis in R, and identified certain variables from the data that seemed to be most important, based on their contribution to the first principal component (PC1). I also created a few figures to help better visualize the data, which I shared with my mentor and discussed in our weekly individual meeting. After sharing my findings and thoughts from the past week, Dr. De introduced me to another data analysis method, UMAP (Uniform Manifold Approximation and Projection), which could help to separate the data into groups based on similarities in different variables. I began learning about the UMAP method and how it works, and used R to apply UMAP to my data. I also explored different methods of visualizing data from the UMAP analysis, and how to determine feature importance from the UMAP results.
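To illustrate what "contribution to the first principal component" means, here is a minimal sketch in Python rather than the R code I actually used. The feature names and the six rows of numbers are made up for illustration and are not from the lab dataset. The sketch standardizes each variable, builds the correlation matrix, and uses power iteration to approximate the PC1 loadings; in R, `prcomp` reports the same kind of quantity in its `rotation` matrix.

```python
import math

# Hypothetical measurements (radius, area, form_factor); not real lab data.
features = ["radius", "area", "form_factor"]
cells = [
    (5.0, 78.0, 0.90),
    (6.0, 113.0, 0.70),
    (7.0, 154.0, 0.85),
    (8.0, 201.0, 0.60),
    (9.0, 254.0, 0.80),
    (10.0, 314.0, 0.75),
]


def standardize(rows):
    """Center each column and scale it to unit sample variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, p = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Approximate the leading eigenvector (PC1 loadings) by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


pc1 = first_component(covariance(standardize(cells)))

# Rank variables by the absolute size of their PC1 loading.
ranked = sorted(zip(features, pc1), key=lambda pair: abs(pair[1]), reverse=True)
for name, loading in ranked:
    print(f"{name}: {loading:+.3f}")
```

The sign of a loading vector is arbitrary, which is why the ranking uses absolute values.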

Week 4 (6/14-6/20)

I shared my findings so far with the lab group during the weekly meeting, and Dr. De advised me on further steps to continue with my analysis. I redid my UMAP analyses using all the variables in the dataset, and from there narrowed down which variables were the most important. From my final UMAP visualization, I was able to separate the cells into two groups, which hopefully correspond to holoclone and meroclone cells. When examining which features were most important in classifying the cells, I found that the radius and formFactor (a measure of shape irregularity) were two of the most prominent. Since holoclones are generally thought to be more circular in shape, due to their higher division rate, this suggests that the UMAP analysis was effective at grouping the cells. I started the next step of my analysis, which is to determine how accurate the cell classification is. To do this, I am looking at the experimental images, judging by eye which cells appear more holoclone-like or meroclone-like, and comparing this grouping with that of the UMAP. My goal for the next week is to analyze more of the lab images and attempt to classify the cell colonies present.
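As a small worked example of the formFactor measure, the sketch below uses the standard definition 4π·Area/Perimeter², which equals 1 for a perfect circle and decreases as a shape becomes more irregular. The function name and the two test shapes are my own choices for illustration, and I am assuming this is the definition used in my dataset.

```python
import math


def form_factor(area, perimeter):
    """Return 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    return 4 * math.pi * area / perimeter ** 2


# A circle of radius 5 and a 10 x 10 square, as idealized shapes.
circle = form_factor(math.pi * 5 ** 2, 2 * math.pi * 5)
square = form_factor(10 * 10, 4 * 10)

print(f"circle: {circle:.3f}")
print(f"square: {square:.3f}")
```

The square scores lower than the circle (π/4 versus 1), which matches the idea that rounder cells have higher formFactor values.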

Week 5 (6/21-6/27)

This week I worked on manually identifying the cell colonies in some of the lab experiment images, and comparing this classification to the percent of holoclone and meroclone cells present in each colony as determined by the UMAP model. I shared my analysis with the postdoc I am working with, to make sure that she agreed with my sorting of the colonies. I also spent some time searching for and reading relevant research articles, specifically trying to find more information on characteristics of holoclone cells and colonies, and how their morphology differs from that of meroclones. There was some existing research on this topic, but not a substantial amount, which I suppose makes sense since the goal of this project is to discover new information, or solidify current findings, regarding intratumor heterogeneity, in particular phenotypic heterogeneity and clonal evolution. I could not find any articles that mentioned the proportion of holoclone-like or meroclone-like cells in a certain colony, so this could possibly be a significant factor that is useful in identifying cancer cell colonies and their evolution. This week my mentor also sent me the R code that can be used to estimate the cell state transition rate between holoclone-like and meroclone-like cells, so I spent time figuring out how this code worked.
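The comparison between my manual colony labels and the model's per-cell labels can be sketched roughly as follows. This is a Python illustration with made-up colony IDs and labels, not my actual R analysis, and the majority-vote rule for turning cell fractions into a colony label is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical per-cell labels from the model: (colony_id, label).
model_cells = [
    ("A", "holo"), ("A", "holo"), ("A", "holo"), ("A", "mero"),
    ("B", "holo"), ("B", "mero"), ("B", "mero"), ("B", "mero"),
    ("C", "holo"), ("C", "mero"), ("C", "mero"),
]

# Hypothetical labels assigned by eye to each colony.
manual_labels = {"A": "holo", "B": "mero", "C": "holo"}


def holoclone_fraction(cells):
    """Fraction of cells in each colony that the model labels 'holo'."""
    counts = defaultdict(lambda: [0, 0])  # colony -> [holo count, total count]
    for colony, label in cells:
        counts[colony][1] += 1
        if label == "holo":
            counts[colony][0] += 1
    return {colony: holo / total for colony, (holo, total) in counts.items()}


fractions = holoclone_fraction(model_cells)

# Assumed rule: a colony counts as holoclone-like if more than half its cells are.
model_labels = {c: ("holo" if f > 0.5 else "mero") for c, f in fractions.items()}

matches = sum(model_labels[c] == manual_labels[c] for c in manual_labels)
agreement = matches / len(manual_labels)
print(fractions, agreement)
```

Colonies where the two labels disagree are the ones worth looking at again in the images.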

Week 6 (6/28-7/4)

This week I continued with my manual analysis of the cell images, and also ran CellProfiler on three different sets of images from lab experiments. The CellProfiler program gives information about the characteristics of the cells in the images, such as mean/median radius, area, and several measures of shape (including eccentricity and FormFactor). This data was then used, along with the UMAP analysis, to classify the cells into two groups, after which I compared the proportion of each cell state present in each colony to my manual classification of the colonies as more holoclone-like or meroclone-like. This Thursday I also attended a very informative session about research papers and how they are structured, which will be useful in the coming weeks as I start to summarize my work and write up my findings formally in a paper.

Week 7 (7/5-7/11)

This week I created GIF videos of tumor cell growth over time for the three sets of images I was analyzing, with the cells classified as holoclone-like or meroclone-like. These can help to visualize how the number, proportion, and clustering of the two cell types change as the cell colonies grow. I also used the previously mentioned R code to determine the cell transition rates for each set of images. I experimented with different grid values (how many sections an image is divided into) and ranges of images to include in the analysis (since images near the very beginning might not have enough cells to be useful, and those near the end are likely too crowded). Some of the transition rate values were negative or were unexpected based on the cell type, so I will continue working with this next week to get the most accurate rates for each set of images. I also attended AI day this week, which introduced me to many new topics and definitely encouraged me to explore the field of AI further in the future. Finally, I began writing my research paper this week; in particular, I worked on the introduction section and started on the methods.
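To show what the grid value does, here is a rough Python sketch that divides an image into grid × grid sections and counts the cells of each type in each section. It is not my mentor's R code and does not estimate transition rates; the coordinates, image size, and function name are hypothetical.

```python
def grid_counts(cells, width, height, grid):
    """Count cells of each type in every (row, col) section of the image."""
    counts = {}
    for x, y, label in cells:
        # Clamp so that cells on the right or bottom edge fall in the last section.
        col = min(int(x / width * grid), grid - 1)
        row = min(int(y / height * grid), grid - 1)
        section = counts.setdefault((row, col), {"holo": 0, "mero": 0})
        section[label] += 1
    return counts


# Hypothetical cell centers (x, y, label) in a 100 x 100 pixel image.
example_cells = [
    (10, 10, "holo"),
    (20, 30, "holo"),
    (90, 90, "mero"),
    (100, 100, "mero"),
    (60, 10, "mero"),
]

sections = grid_counts(example_cells, width=100, height=100, grid=2)
for key in sorted(sections):
    print(key, sections[key])
```

A larger grid value gives finer sections but fewer cells in each, which is one reason the choice of grid can affect the estimated rates.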

Week 8 (7/12-7/18)

This week I spent most of my time working on my final research paper. I went back through the work I had done and found the most relevant figures and results to include in my paper. This included remaking some graphs in R and creating new data tables as well. I also kept working with the cell state transition values, still trying to find the most accurate and reasonable values for each set of images. In writing my research paper, I read through some journal articles related to the general topic of tumor heterogeneity, and also looked more into the T24 cell line and its importance. Finally, I started to create my presentation for the end of the REU program. Next week, I will finish writing my paper and present the research I have worked on this summer to the rest of the group.

Week 9 (7/19-7/23)

This was the final week of the DIMACS program, so I spent most of my time working on finishing my presentation and final paper. I collected references that I used throughout the summer, organized my results and figures, and summarized my work in the conclusion section of the paper. I was able to watch the other DIMACS participants' presentations on Thursday and Friday, and present my own research on Friday. I found it very interesting to hear about the progress that all the participants had made over the summer, even if I could not fully understand the details of their research. Overall, I really enjoyed the DIMACS program and working with my mentor Dr. De this summer.

Acknowledgements

Thank you to:

DIMACS REU program
NSF grant CCF-1852215
My mentor: Dr. Subhajyoti De

References & Links

Presentations

Here is the REU website:

The REU Website

Sundari Arunarasu's REU 2021 Web Page

About Me