DIMACS
DIMACS REU 2018

General Information

Student: Christopher Espana
Office: CORE 419
School: Rutgers University
E-mail: ce187 (at) dls.rutgers.edu
Project: Statistical Research at the Cardiovascular Institute - Atrial Fibrillation and Risks of Pulmonary Embolisms

Project Description

Atrial fibrillation (AF) is a heart condition characterized by irregular and rapid beating of the atria. It can cause complications such as blood clots or cardiomyopathy. A suspected but not yet confirmed complication of AF is pulmonary embolism (PE), a blockage of a pulmonary artery by some material, usually a blood clot. At present, the majority of PE cases are caused by deep vein thrombosis (DVT), in which blood clots form in the legs and travel (making them emboli) into the lungs. My project will use the Myocardial Infarction Data Acquisition System (MIDAS) to compile information on cases that may help establish a link between AF and PE. We will evaluate different control and test cases based on patients who suffer from different combinations of these diseases.


Weekly Log

Week 1:
I met my mentor, Rutgers statistics professor Javier Cabrera, and discussed his areas of research. We talked about a project he is currently working on, which involves creating personalized disease networks (PDNs) to understand and predict cardiovascular diseases, among other complications. I also attended a meeting with Professor Cabrera in which several graduate students presented the work they had accomplished so far on their own projects. A few doctors and biomedical students were also in attendance. So far, I have been reviewing a text suggested by Professor Cabrera (An Introduction to Statistical Learning with Applications in R) in order to learn the R programming language thoroughly and be able to apply it in my project.
Week 2:
I talked with Dr. Kostis, who explained the medical side of what my project will entail. I also met with Davit Sargsyan, MS, who gave me an overview of the database I will be accessing for this project, along with other "maintenance" I will need to know. Most of my time has been devoted to learning R better and teaching myself higher-level statistics in order to fully understand the work I will be doing in the upcoming weeks.
Week 3:
Because my project involves dealing with real patient data, I've had to jump through a few hoops in order to access it, such as getting certified to use patient data. There have been some issues regarding this, but in the meantime I've been working with simulated data to make progress on the main project. I've modelled some behavior that I will need on the real data set, such as simple linear regressions on patient ages, and histograms summarizing various important data sets.
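As a rough illustration of this kind of practice run on simulated data, here is a minimal sketch in Python (the project itself uses R). Everything in it is an assumption made for illustration: the response variable (length of stay), the simulated relationship with age, and the helper names simulate_patients, fit_simple_linear_regression, and age_histogram are hypothetical and do not come from MIDAS or from the project code.

```python
import random
from collections import Counter


def simulate_patients(n, seed=0):
    """Return (age, length_of_stay) pairs drawn from a made-up linear model."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        age = rng.uniform(40, 90)
        # Hypothetical relationship: 0.1 extra days of stay per year of age,
        # plus Gaussian noise. This is not based on real patient data.
        stay = 2.0 + 0.1 * age + rng.gauss(0, 1.0)
        patients.append((age, stay))
    return patients


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def age_histogram(ages, bin_width=10):
    """Count patients per age bin, keyed by the lower edge of each bin."""
    counts = Counter(int(age // bin_width) * bin_width for age in ages)
    return dict(sorted(counts.items()))


patients = simulate_patients(500)
ages = [age for age, _ in patients]
stays = [stay for _, stay in patients]

intercept, slope = fit_simple_linear_regression(ages, stays)
histogram = age_histogram(ages)

print(f"intercept={intercept:.2f}, slope={slope:.3f}")
print(histogram)
```

Because the data are generated from a known model, the fitted slope can be compared with the value used in the simulation, which is a simple way to check the analysis code before it is run on real records.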
Week 4:
I've begun to work on analyzing and extracting information on comorbidity (the presence of diseases in addition to a main disease). This involves implementing new code that parses ICD-9 codes (the codes used to classify and report diseases and procedures) in order to find each patient's comorbidities. From there, I should be able to select the specific comorbidity cases I am looking for (such as PE and AF). Since Evaristo Rodriguez is working on a separate project that also involves this comorbidity analysis, we will likely work together to understand the code we will use.
Week 5:
Integrating the comorbid function into my code has been a bit more difficult than I anticipated. It became more challenging after my last meeting with Davit, who explained that I should also focus on additional diseases that may influence venous thromboembolism (the combination of DVT and PE). Soon I should be ready to use real patient data to get results instead of working only with the simulated data.
Week 6:
This week was mostly spent working on the final presentation, but I did make progress with the comorbidity analysis. By creating my own mapping file (which is used to turn ICD-9 codes into their corresponding diseases), I was able to use my existing code to sort patients by the diseases they were admitted for, and so group the data into the cases I wish to investigate. For the final presentation, Evaristo and I presented together, since both of our projects involved working at the Cardiovascular Institute and shared similar aspects, such as ICD codes.
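The idea of a mapping file that turns ICD-9 codes into diseases, followed by sorting patients into cases, can be sketched as below. This is a minimal illustration in Python under stated assumptions, not the project's actual code or mapping file: the mapping ICD9_PREFIX_TO_DISEASE, the helper names, and the patient records are hypothetical, and the three code prefixes are illustrative entries that should be checked against an official ICD-9-CM table before any real use.

```python
# Hypothetical mapping from ICD-9 code prefixes (dots removed) to disease labels.
# Illustrative entries only; verify against an official ICD-9-CM code table.
ICD9_PREFIX_TO_DISEASE = {
    "42731": "AF",   # atrial fibrillation
    "4151": "PE",    # pulmonary embolism and infarction
    "4534": "DVT",   # deep vein thrombosis of the lower extremity
}


def normalize(code):
    """Strip whitespace and the decimal point so '427.31' and '42731' match."""
    return code.strip().replace(".", "").upper()


def diseases_for(codes, mapping=ICD9_PREFIX_TO_DISEASE):
    """Return the set of disease labels whose prefix matches any of the codes."""
    found = set()
    for code in codes:
        clean = normalize(code)
        for prefix, disease in mapping.items():
            if clean.startswith(prefix):
                found.add(disease)
    return found


def group_by_case(records, mapping=ICD9_PREFIX_TO_DISEASE):
    """Group patient IDs by their combination of diseases, e.g. 'AF+PE'."""
    cases = {}
    for patient_id, codes in records.items():
        label = "+".join(sorted(diseases_for(codes, mapping))) or "none"
        cases.setdefault(label, []).append(patient_id)
    return cases


# Made-up records: patient ID -> list of ICD-9 codes on the admission.
records = {
    "p1": ["427.31", "415.19"],
    "p2": ["42731"],
    "p3": ["453.40", "415.11"],
    "p4": ["401.9"],
}

cases = group_by_case(records)
for label, patient_ids in sorted(cases.items()):
    print(label, patient_ids)
```

Matching on prefixes rather than exact codes lets one mapping entry cover all the subcodes beneath it, at the cost of needing care that no prefix is so short that it also matches unrelated codes.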
Week 7:
This week I started work on my final report and made some minor progress on the project itself. Since this is likely the last week I can meet with my mentor before the REU ends, I gave a brief presentation on my work to a few members of the Cardiovascular Institute. Professor Cabrera suggested I look into modelling some of my results using Cox regression, which will require me to learn a bit more statistics in order to implement it accurately.
Week 8:
TBA

Presentations


Additional Information