§ DF Simola


http://dfsimola.com/cv

Curriculum Vitae

 
§ site  posted 26 Sep 2005; modified 12 Aug 2009

My interests lie in understanding the complexity of interactions among development, evolution, and environment. Specifically: what is the overall developmental process architecture that generates an organism from its genome, how does this architecture structure the expression of evolutionary and environmental variation, and how has evolution crafted such an architecture?

Résumé

Contact

Daniel F. Simola

103I Lynch Laboratories
433 South University Avenue
University of Pennsylvania
Philadelphia, PA 19104

Education

  • University of Pennsylvania
    Ph.D. candidate in Genomics and Computational Biology, September 2009

    Dissertation: Evolution of Genome-Wide Gene Regulation in the Budding Yeast Cell-Division Cycle

  • Dartmouth College
    B.A. Computer Science with High Honors, 2003

  • St. Joseph's Preparatory School
    High school diploma, 1999

Publications

  • D. Simola & J. Kim. Evolutionary dynamics of the Saccharomyces genome and a resource for population genomics. In revision (2008).
  • D. Simola, C. Francis, P. Sniegowski, & J. Kim. Heterochrony drives transcriptome divergence in the cell-division cycle of woodland budding yeast. Submitted (2009).

Dissertation: Evolution of Genome-Wide Gene Regulation in the Budding Yeast Cell-Division Cycle

Genome-wide regulation of gene expression involves a dynamic epigenetic structure which generates an organism's life-cycle. Although changes in gene expression during development have broad effects on many basic phenomena including cell growth and differentiation, morphogenesis, and disease progression, the evolutionary forces influencing gene expression dynamics and gene regulation remain largely unknown, due to the nature of gene expression as a complex, quantitative trait. Moreover, since gene expression is regulated differentially over time, the effects of evolutionary forces may be influenced by developmental context. Thus, to advance the understanding of evolution in the context of the life-cycle, the architecture of gene expression timing control and its influence on gene expression dynamics must be revealed.

This dissertation presents two experimental investigations, one of the evolution of genes and related structural regions and one of time-dependent gene expression, using the budding yeasts Saccharomyces cerevisiae and Saccharomyces paradoxus and their mitotic cell-division cycles as model organisms and life-cycle. Comparative methodologies are employed to analyze genome-wide patterns of genetic and phenotypic diversity within and between species.

Analysis of several dozen Saccharomyces genomes reveals that they are evolving under moderate to strong purifying selection. Despite limited genetic variability, differences in transcriptional regulation appear to have contributed the most to divergence between species, and changes in regulation of ribosomal genes may have altered the timing of each species' transition from vegetative growth to reproduction, a classic life-history trait. In addition, natural variation in gene expression was measured as a genome-wide time-series through the mitotic cell-division cycle of nine woodland lines of the budding yeast S. cerevisiae and one outgroup S. paradoxus. Despite levels of expression variation consistent with strong stabilizing selection, significant divergence of transcriptome coexpression dynamics within and between species was revealed at all modular scales of organization. A model involving timing pattern changes explains 61% of the between-genome transcriptome variation, suggesting that the major mode of transcriptome evolution involves changes in timing (heterochrony) rather than changes in levels (heterometry) of expression. Furthermore, comparative analysis of heterochrony patterns suggests an architecture for transcriptome timing control composed of modular and dynamically autonomous gene expression timelines, each subject to independent evolution.

Teaching experience

  • Sole teaching assistant for Principles of Computational Biology (GCB 536) in 2008 and 2009. I was responsible for generating homework assignments, grading exams, holding office hours, and leading review sessions.

Conference talks

Conferences

Software

I have written a variety of programs, ranging from utilities for everyday tasks such as batch file renaming and selecting rows from a delimited file to bioinformatics scripts. The majority of the code is Python, with some scabs of Perl, all of which is available to use or peruse. Please give credit where it's due, and apologies for the many imperfections...

  • Download the code
  • Or, for the latest versions, tap into my Subversion repository:
    svn co svn://dfsimola.com/svn/simolacode/trunk simolacode

Graduate course work

  • ESE674: Information theory
  • GCB 531: Genomics
  • BIOL597: Developmental neurobiology seminar
  • BIOL446: Introductory statistics
  • BIOM600: Cell biology
  • BIOM555: Control of gene expression
  • CAMB550: Genetics
  • CIS520: Introductory machine learning
  • CIS700: Machine learning for bioinformatics
  • Seminar on sequence alignment (Sampath Kannan, Sridhar Hannenhalli)
  • GCB537: Seminar in phylogeny (Junhyong Kim)
  • Independent study: Modeling gene expression (Junhyong Kim)
  • Independent study: Population genetics (Warren Ewens)

Graduate Rotations

Dendritic Hotspots of Translational Activity

PI: Jim Eberwine

The neurological mechanism effecting synaptic plasticity remains largely uncharacterized. Recent experiments have shown that vertebrate neurons both sequester mRNA at dendritic terminals and have the capacity to translate this mRNA locally. It is likely that local translation machinery is utilized in dendrites to organize and mediate responses to presynaptic signaling. Such a mechanism could contribute to the formation and/or maintenance of a synaptic structure which is fundamental to learning and memory. Further experiments have presented temporal and spatial data corresponding to local translation events. Analysis of the temporal data suggests that dendritic ribosomal machinery can be stimulated to translate mRNA at both an exponential rate and a linear rate, in contrast to the strictly linear translation rate in the cell soma. Analysis of the spatial data suggests that translational hotspots could take up permanent residence at specific locations within dendrites.

Prediction of Transcription Factor Gene Expression in Yeast

PI: Junhyong Kim

Our goal was to predict the gene expression levels of known transcription factors in Saccharomyces cerevisiae. I used regression to evaluate linear models based on the yeast transcriptional regulatory network and expression data from the cell cycle. I took a Fourier approach, expressing the temporal expression profiles of genes as periodic waves. Each model’s explanatory variables are drawn from a basis subset of transcription factor genes whose proteins were found to regulate every downstream transcription factor gene. Thus each model represents the expression of a target gene as a weighted linear combination of waves.
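As a rough illustration of the kind of model described above, and not the code used in the rotation, the sketch below fits each regulator's time course to a single sinusoid and then regresses a target profile on those fitted waves. Everything in it is an assumption made for the example: the synthetic data, the 90-minute period, the use of one harmonic only, and the names fit_wave, fit_target, tf_a, and tf_b.

```python
import numpy as np


def fit_wave(t, y, period):
    """Least-squares fit of y(t) to a constant plus one sinusoid of the given period."""
    omega = 2.0 * np.pi / period
    design = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef  # the smoothed, periodic version of y


def fit_target(regulator_waves, target):
    """Regress a target profile on regulator waves (plus an intercept).

    Returns the fitted weights (intercept first) and the R^2 of the fit.
    """
    design = np.column_stack([np.ones(len(target)), regulator_waves])
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    prediction = design @ weights
    ss_res = np.sum((target - prediction) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return weights, 1.0 - ss_res / ss_tot


# Synthetic example: two hypothetical regulators and one target gene,
# sampled every 10 minutes over two assumed 90-minute cycles.
rng = np.random.default_rng(0)
period = 90.0
t = np.arange(0.0, 180.0, 10.0)
omega = 2.0 * np.pi / period

tf_a = 1.0 + np.sin(omega * t) + rng.normal(0.0, 0.05, t.size)
tf_b = 0.5 + np.cos(omega * t) + rng.normal(0.0, 0.05, t.size)
target = 0.2 + 0.7 * tf_a - 0.4 * tf_b + rng.normal(0.0, 0.05, t.size)

# Express each regulator as a wave, then model the target as a weighted sum of waves.
waves = np.column_stack([fit_wave(t, tf_a, period), fit_wave(t, tf_b, period)])
weights, r_squared = fit_target(waves, target)

print("intercept and regulator weights:", np.round(weights, 2))
print("R^2:", round(float(r_squared), 3))
```

On real data the regulators for each target would come from the regulatory network rather than being chosen by hand, and more harmonics could be added to fit_wave if a single sinusoid fits poorly.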

Expression Analysis of Mouse Chromosome 5 and Presynaptic Transmission Genes

PI: Maja Bucan

There are 350 known genes and over 150 putative genes that lie along a 77 megabase stretch of Mouse Chromosome 5. This region has been important in human disease modeling, as a balancer chromosome for this region is available, which facilitates mutagenesis. My goal was to create an expression map of this region, using microarrays developed in the lab, in conjunction with Novartis and other microarray data sources. Such an expression map will serve as a tool to search for functional correlations with disease genes, to validate putative genes, and to understand expression regulation in a continuous range of DNA.

Undergraduate Research

Dartmouth Senior Thesis: Discovery, Visualization and Analysis of Gene Regulatory Sequence Elements in Genomes

Advisor: Bob Gross

The advent of rapid DNA sequencing has produced an explosion in the amount of available sequence information, permitting us to ask many new questions about DNA. There is a pressing need to design algorithms that can provide answers to questions related to the control of gene expression, and thus to the structure, function, and behavior of organisms. Such algorithms must filter through massive amounts of informational noise to identify meaningful conserved regulatory DNA sequence elements.

We are approaching these questions with the notion that visualization is a key to exploring data relationships. Understanding the exact nature of these relationships can be very difficult by simply interpreting raw data. The ability to look at data in a graphical form allows us to apply our innate capacity to think visually to discern the subtle relationships that might not be recognizable otherwise. This thesis provides computational tools to visually identify and analyze candidate motifs in the DNA of a species.

The tools include a parsing utility to store genomic data and an application to search for and visually identify motifs. Using these tools, novel and previously compiled gene sets were identified using the genome of the plant species Arabidopsis thaliana.