General Information
Student: Colton Fitzjarrald
School: University of Missouri St. Louis
E-mail: cwfdfd@umsl.edu
Project: Genome folding and function: from DNA base pairs to nucleosome arrays and chromosomes
Project Description
My work examines the growth patterns of the E. coli ribosome. Recent structural data is available
for this ribosome at several stages of its development. Using techniques from mathematical
biology and knot theory, we aim to understand how the topology of the E. coli ribosome changes from its
earliest stage to its fully matured state.
Weekly Log
- Week 1:
This week centered on an in-depth literature review. The bulk of it covered biology and
chemistry, specifically the mechanisms of nucleic acids, RNA, and ribosomes, in addition to the
more basic aspects of molecular and chemical biology. I also took a brief look at various techniques
used in bioinformatics and mathematical biology. Another large portion of the week went into preparing the
initial presentation of my research intentions to peers and faculty.
- Week 2:
Familiarized myself with the Protein Data Bank (PDB) and current methods of data visualization in chemical
biology. I looked for an external C++ library to read files from the PDB, but the ones I found had poor
documentation that was difficult to work with. At the end of the week I decided to write my own program to
read the .cif and .pdb files.
- Week 3:
Most of the work this week went into developing a program to read and extract information
from the .cif and .pdb files in the PDB. What I intended as a simple program quickly grew into
a relatively complex library of functions.
I also surveyed external C++ libraries for data visualization. The libraries currently available
were insufficient for my needs and poorly documented. At the end of the week I searched for external
software that would handle the visualization of my data most efficiently and integrate most easily with my
current design. The primary candidates were Python and MATLAB.
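As an illustration of what a .pdb reader has to handle, here is a minimal sketch in Python. The library described in this log is written in C++ and its actual interface is not shown here, so the names `Atom`, `parse_atom_line`, and `read_atoms` are hypothetical. The sketch relies only on the fixed-column layout of ATOM/HETATM records in the PDB file format and ignores every other record type.

```python
from typing import Iterable, List, NamedTuple, Optional


class Atom(NamedTuple):
    """One coordinate record (hypothetical container, not the author's type)."""
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Optional[Atom]:
    """Parse one ATOM/HETATM record; return None for any other line."""
    if not line.startswith(("ATOM", "HETATM")):
        return None
    # PDB records are fixed-width; columns below are 1-based in the format
    # description, so column 7-11 becomes the slice [6:11], and so on.
    return Atom(
        serial=int(line[6:11]),        # columns 7-11: atom serial number
        name=line[12:16].strip(),      # columns 13-16: atom name
        res_name=line[17:20].strip(),  # columns 18-20: residue name
        chain_id=line[21],             # column 22: chain identifier
        res_seq=int(line[22:26]),      # columns 23-26: residue number
        x=float(line[30:38]),          # columns 31-38: x in angstroms
        y=float(line[38:46]),          # columns 39-46: y in angstroms
        z=float(line[46:54]),          # columns 47-54: z in angstroms
    )


def read_atoms(lines: Iterable[str]) -> List[Atom]:
    """Collect every coordinate record from an iterable of lines."""
    atoms = []
    for line in lines:
        atom = parse_atom_line(line)
        if atom is not None:
            atoms.append(atom)
    return atoms
```

Passing an open file object to `read_atoms` is enough for a small .pdb file. Large structures such as ribosomes can exceed the limits of the fixed-width format and are often distributed as .cif, which needs a separate, token-based reader that this sketch does not cover.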
- Week 4:
Python's Seaborn library was chosen as the best visualization tool to integrate with the C++
library. The integration between the two was completed, allowing work on the final visualizations of the
data to begin. A program to read the .json files containing base pair information was also written in C++
and added to the existing library.
The visualizations created were: scatter plots of base pair data, which show the residues within
the ribosomal structure that are connected via base pairs; nucleic acid contact maps, which are binary
distance matrices of nucleic acid residues; and heatmaps of the first 526 nucleic acid residues, plotting their
relative distances against their residue number at the given stage of the ribosome.
I am seriously considering publishing the libraries developed so far on
GitHub, where they could assist other researchers who find the current C++ libraries in this area of
research lacking.
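The contact maps mentioned in this entry can be sketched as follows. This is a minimal illustration, not the code used in the project: it assumes one representative 3D coordinate per residue and a distance cutoff chosen by the user, neither of which is specified in this log, and the function name `contact_map` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def contact_map(coords: Sequence[Point], cutoff: float) -> List[List[int]]:
    """Return a binary matrix: entry [i][j] is 1 when residues i and j lie
    within `cutoff` of each other (same units as the coordinates), else 0."""
    n = len(coords)
    return [
        [1 if math.dist(coords[i], coords[j]) <= cutoff else 0 for j in range(n)]
        for i in range(n)
    ]


# Three residues on a line; the cutoff of 5.0 is an arbitrary example value.
example = contact_map([(0, 0, 0), (3, 0, 0), (10, 0, 0)], cutoff=5.0)
```

The resulting matrix is symmetric with ones on the diagonal. A nested list like this can be handed to a heatmap routine such as `seaborn.heatmap` for plotting.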
- Week 5:
This week consisted entirely of ongoing plot adjustments to refine details. I also deliberated at
length over which tools and methods to consider for future modeling. I decided that, once
this portion of the project is complete, work will begin on the writhing number, a
concept in knot theory often applied in topological chemistry, to analyze the ribosome structure.
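To make the writhing number concrete: for a space curve it is defined by the Gauss double integral over all pairs of points on the curve, and for a chain of straight segments that integral reduces to a sum of signed solid angles over pairs of segments. The sketch below follows the segment-pair formula commonly attributed to Klenin and Langowski. It is an illustration under assumptions, not the method chosen for this project: the function names are hypothetical, the input is assumed to be an ordered list of 3D points (for example one point per residue), and the sign convention may differ from other references.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _asin(x):
    # Clamp to [-1, 1] so rounding error cannot push asin out of its domain.
    return math.asin(max(-1.0, min(1.0, x)))


def pair_solid_angle(p1, p2, p3, p4, eps=1e-12):
    """Signed solid angle for segment p1->p2 seen against segment p3->p4."""
    r12, r34 = _sub(p2, p1), _sub(p4, p3)
    r13, r14 = _sub(p3, p1), _sub(p4, p1)
    r23, r24 = _sub(p3, p2), _sub(p4, p2)
    normals = []
    for u, v in ((r13, r14), (r14, r24), (r24, r23), (r23, r13)):
        n = _cross(u, v)
        length = math.sqrt(_dot(n, n))
        if length < eps:
            return 0.0  # degenerate normal: the segments are coplanar
        normals.append((n[0] / length, n[1] / length, n[2] / length))
    n1, n2, n3, n4 = normals
    omega = (_asin(_dot(n1, n2)) + _asin(_dot(n2, n3))
             + _asin(_dot(n3, n4)) + _asin(_dot(n4, n1)))
    triple = _dot(_cross(r34, r12), r13)  # its sign gives the crossing sign
    if triple > 0:
        return omega
    if triple < 0:
        return -omega
    return 0.0  # coplanar segments contribute nothing


def writhe(points, closed=False):
    """Writhe of a polygonal chain; set closed=True to join last to first."""
    segments = [(points[i], points[i + 1]) for i in range(len(points) - 1)]
    if closed:
        segments.append((points[-1], points[0]))
    m = len(segments)
    total = 0.0
    for i in range(m):
        for j in range(i + 2, m):  # skip adjacent segments
            if closed and i == 0 and j == m - 1:
                continue  # first and last segments share a vertex when closed
            total += pair_solid_angle(*segments[i], *segments[j])
    return total / (2.0 * math.pi)
```

Two properties are easy to check with this sketch: a planar curve has writhe zero, and mirroring a curve flips the sign of its writhe. For an open chain such as an RNA backbone the value is still well defined, but it is a geometric quantity that changes as the chain deforms rather than a topological invariant.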
- Week 6:
This week begins with a heavy literature review of topological methods applied to molecular biology.
Presentations
Additional Information