Mentor: Gyan Bhanot, BioMaPS Institute and BME, Rutgers University

Project title: Finding the Sequence of "mtDNA Eve"

Mitochondria are structures found in the cytoplasm of eukaryotic cells that are responsible for vital cellular functions such as oxidative phosphorylation and the release of cytochrome c to initiate apoptosis. They carry their own genome, mitochondrial DNA (mtDNA). The current understanding of their origin derives from a proposal by Lynn Margulis in the 1970s, who argued for an extracellular origin through an endosymbiosis event between bacteria and the early ancestors of eukaryotic cells. In complex organisms such as humans, mtDNA is maternally inherited without recombination and has a high mutation rate (roughly 10 times that of nuclear DNA). These facts make population analysis of mtDNA sequences an ideal method for tracing the maternal ancestry of geographically isolated populations and for understanding the origins and migratory history of humanity. Such analyses, using clustering and phylogenetic techniques, have shown that modern humans emerged from Africa in two or more migrations approximately 50-70 kYBP (thousand years before present). The tree describing these events has its root in a sequence that represents “mitochondrial Eve”, who can be thought of as the ancestral mother of all modern humans. Mapping the tree to world geography suggests that “mtDNA Eve” lived in Africa approximately 200 kYBP.

The project we propose is to use data on approximately 2000 mtDNA sequences from worldwide sources (available at ) to infer the mtDNA sequence of “mitochondrial Eve”. This would involve first constructing trees using methods such as parsimony, maximum likelihood, UPGMA, neighbor joining and clustering, and then inferring a “most likely sequence” for “mtDNA Eve” from these trees. Permutation tests on haplogroup (cluster) labels would be used to calculate a weight for each possible base assignment at each mtDNA locus. These weights would reflect the relative accuracy of each tree, measured by a probability based on the consensus accuracy of its internal branches across possible trees, given the observed clustering of sequences into robust haplogroups. The basis for this analysis is a recent paper, submitted for publication, on which the mentor is a senior author.
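As a rough illustration of how root sequences from several trees might be combined into one "most likely sequence", the sketch below runs a bottom-up Fitch parsimony pass on each tree and takes a weighted per-site vote over the candidate root bases. This is only a minimal sketch under stated assumptions, not the method of the paper mentioned above: Fitch parsimony stands in for whichever ancestral-state method is actually used, the tree weights are hypothetical placeholders for the weights that the permutation tests would supply, and the four short sequences and the names `fitch_root_sets` and `weighted_root_sequence` are invented for the example.

```python
from collections import defaultdict


def fitch_root_sets(tree, leaf_seqs):
    """Bottom-up Fitch pass: one set of candidate root bases per site.

    `tree` is a rooted binary tree written as nested 2-tuples of leaf names;
    `leaf_seqs` maps each leaf name to its aligned sequence (equal lengths).
    """
    if isinstance(tree, str):  # a leaf: each site has exactly one base
        return [{base} for base in leaf_seqs[tree]]
    left, right = (fitch_root_sets(child, leaf_seqs) for child in tree)
    # Fitch rule: keep the intersection if non-empty, otherwise the union.
    return [(a & b) or (a | b) for a, b in zip(left, right)]


def weighted_root_sequence(trees_with_weights, leaf_seqs):
    """Combine root candidates from several trees by a weighted per-site vote."""
    n_sites = len(next(iter(leaf_seqs.values())))
    votes = [defaultdict(float) for _ in range(n_sites)]
    for tree, weight in trees_with_weights:
        for site, candidates in enumerate(fitch_root_sets(tree, leaf_seqs)):
            # Split a tree's weight evenly among its equally parsimonious bases.
            for base in candidates:
                votes[site][base] += weight / len(candidates)
    # sorted() makes tie-breaking deterministic (alphabetical).
    return "".join(max(sorted(v), key=v.get) for v in votes)


# Toy data: hypothetical aligned fragments, NOT real mtDNA sequences.
leaf_seqs = {"A": "ACGT", "B": "ACGA", "C": "ATGA", "D": "GTGA"}

# Two candidate topologies with hypothetical weights (assumed to sum to 1).
trees_with_weights = [
    ((("A", "B"), ("C", "D")), 0.6),
    (((("A", "B"), "C"), "D"), 0.4),
]

root_sequence = weighted_root_sequence(trees_with_weights, leaf_seqs)
print(root_sequence)
```

With real data the leaves would be the sampled mtDNA sequences, the topologies would come from the tree-building methods listed above, and sites where the vote is close could be flagged as uncertain rather than assigned a single base.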