§ DF Simola

digital projections

Curriculum Vitae

 
§ site  posted 26 Sep 2005; modified 11 Dec 2008

My academic interests lie in understanding patterns and processes of life, particularly the generative ontogenetic patterns of morphogenesis and differentiation, and their evolutionary modification. I believe computational approaches are a critical component of investigating these phenomena, specifically modeling and simulation that apply computational, mathematical, and statistical methods to genomic and otherwise comprehensive data (and just enough bench work).

Résumé

Contact

Daniel F. Simola

103I Lynch Laboratories
433 South University Avenue
University of Pennsylvania
Philadelphia, PA 19104
campus map | area map

Dissertation: Evolution of gene expression in the cell-division cycle of woodland populations of budding yeast

Gene transcription events compose the primary decoding of an organism’s genetic program, forming the basis of phenotypic expression. Although changes in the regulation of gene expression have broad implications for the evolution of many basic biological phenomena, such as cell growth and proliferation, body plan formation, and environmental and genetic robustness, the questions of which natural evolutionary processes affect gene expression, and how, remain largely unanswered, owing to the nature of transcription as a complex quantitative trait. Moreover, as transcription is a dynamic process, with expression levels changing throughout an organism’s development, evolution likely influences a gene’s expression differentially as a function of time. Taking a comparative genomic approach, this dissertation aims to evaluate transcriptome evolution empirically in natural, woodland strains of the budding yeast Saccharomyces cerevisiae during the mitotic cell-division cycle.

Education

  • University of Pennsylvania
    Ph.D. candidate in Genomics and Computational Biology

  • Dartmouth College
    B.A. Computer Science with High Honors, 2003

  • St. Joseph’s Preparatory School
    High school diploma, 1999

Software

I have written a variety of programs, ranging from tools for everyday tasks such as batch file renaming and selecting rows from a delimited file to bioinformatic scripts. The majority of the code is Python, with some scabs of Perl, all of which is available to use or peruse. Please give credit where it’s due, and apologies for the many imperfections…

  • Download the code
  • Or, for the latest versions, tap into my subversion repository:
    svn co svn://dfsimola.com/svn/simolacode/trunk simolacode

Teaching experience

  • Sole teaching assistant for Principles of Computational Biology (GCB 536), Spring 2008. I was responsible for generating homework assignments, grading exams, holding office hours, and leading review sessions.

Talks

Conferences

Future Interests

Temporal and spatial pattern development has interested me for a long time, and my mental mullings have led me towards thoughts of complexity. Recently I have begun thinking about the role robustness plays in organizing and maintaining systemic complexity. A recent paper by Erica Jen at the Santa Fe Institute best summarizes the entailments of robustness:

Exploring the difference between “stable” and “robust” touches on essentially every aspect of what we instinctively find interesting about robustness in natural, engineering, and social systems. It is argued here that robustness is a measure of feature persistence in systems that compels us to focus on perturbations, and often assemblages of perturbations, qualitatively different in nature from those addressed by stability theory. Moreover, to address feature persistence under these sorts of perturbations, we are naturally led to study issues including: the coupling of dynamics with organizational architecture, implicit assumptions of the environment, the role of a system’s evolutionary history in determining its current state and thereby its future state, the sense in which robustness characterizes the fitness of the set of “strategic options” open to the system; the capability of the system to switch among multiple functionalities; and the incorporation of mechanisms for learning, problem-solving, and creativity.

Graduate course work

  • ESE674: Information theory
  • GCB531: Genomics
  • BIOL597: Developmental neurobiology seminar
  • BIOL446: Introductory statistics
  • BIOM600: Cell biology
  • BIOM555: Control of gene expression
  • CAMB550: Genetics
  • CIS520: Introductory machine learning
  • CIS700: Machine learning for bioinformatics
  • Seminar on sequence alignment (Sampath Kannan)
  • GCB537: Seminar in phylogeny (Junhyong Kim)
  • Independent study: Modeling gene expression (Junhyong Kim)
  • Independent study: Population genetics (Warren Ewens)

Graduate Rotations

Dendritic Hotspots of Translational Activity

PI: Jim Eberwine

The neurological mechanism effecting synaptic plasticity remains largely uncharacterized. Recent experiments have shown that vertebrate neurons both sequester mRNA at dendritic terminals and have the capacity to translate this mRNA locally. It is likely that local translation machinery is used in dendrites to organize and mediate responses to presynaptic signaling. Such a mechanism could contribute to the formation and/or maintenance of a synaptic structure that is fundamental to learning and memory. Further experiments have yielded temporal and spatial data on local translation events. Analysis of the temporal data suggests that dendritic ribosomal machinery can be stimulated to translate mRNA at both an exponential rate and a linear rate, in contrast to the strictly linear translation rate in the cell soma. Analysis of the spatial data suggests that translational hotspots could take up permanent residence at specific locations within dendrites.

Prediction of Transcription Factor Gene Expression in Yeast

PI: Junhyong Kim

Our goal was to predict the gene expression levels of known transcription factors in Saccharomyces cerevisiae. I used regression to evaluate linear mathematical models based on the yeast transcriptional regulatory network and expression data from the cell cycle. I took a Fourier approach, expressing the temporal expression profiles of genes as periodic waves. Each model’s explanatory variables are taken from a basis subset of transcription factor genes, whose proteins were found to regulate every downstream transcription factor gene. Thus each model represents the expression of a target gene as a weighted linear combination of waves.
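
As a rough illustration of this kind of model (not the code written during the rotation), the sketch below approximates each expression profile by a single sinusoid at the cell-cycle period and then fits least-squares weights for a target gene. The function names, the single-harmonic approximation, the period of 8 time units, and the synthetic profiles are all assumptions made for the example.

```python
import math


def fourier_wave(times, values, period):
    """Approximate a profile by its mean plus one sinusoid of the given period.

    Assumes evenly spaced samples covering a whole number of periods.
    """
    n = len(values)
    omega = 2.0 * math.pi / period
    mean = sum(values) / n
    a = 2.0 / n * sum((v - mean) * math.cos(omega * t) for t, v in zip(times, values))
    b = 2.0 / n * sum((v - mean) * math.sin(omega * t) for t, v in zip(times, values))
    return [mean + a * math.cos(omega * t) + b * math.sin(omega * t) for t in times]


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Fails with ZeroDivisionError if A is singular.
    """
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_weights(regulator_waves, target_wave):
    """Least-squares weights for target ~ sum_k w_k * regulator_wave_k.

    Uses the normal equations; the regulator waves must be linearly independent.
    """
    k = len(regulator_waves)
    gram = [[sum(x * y for x, y in zip(regulator_waves[i], regulator_waves[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(x * y for x, y in zip(regulator_waves[i], target_wave))
           for i in range(k)]
    return solve_linear_system(gram, rhs)


if __name__ == "__main__":
    # Synthetic data: two hypothetical regulators and a target built from them.
    period = 8.0
    times = list(range(16))  # two full periods, evenly sampled
    omega = 2.0 * math.pi / period
    tf_a = [1.0 + math.cos(omega * t) for t in times]
    tf_b = [2.0 + 0.5 * math.sin(omega * t + 0.7) for t in times]
    target = [0.5 * x - 1.5 * y for x, y in zip(tf_a, tf_b)]

    waves = [fourier_wave(times, tf_a, period), fourier_wave(times, tf_b, period)]
    weights = fit_weights(waves, fourier_wave(times, target, period))
    print(weights)  # should be close to the weights used to build the target
```

With real data, each regulator profile would come from measured expression over the cell cycle, and more harmonics or a regularized fit might be needed; those choices are outside what this sketch shows.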

Expression Analysis of Mouse Chromosome 5 and Presynaptic Transmission Genes

PI: Maja Bucan

There are 350 known genes and over 150 putative genes that lie along a 77 megabase stretch of Mouse Chromosome 5. This region has been important in human disease modeling, as a balancer chromosome for this region is available, which facilitates mutagenesis. My goal was to create an expression map of this region, using microarrays developed in the lab, in conjunction with Novartis and other microarray data sources. Such an expression map will serve as a tool to search for functional correlations with disease genes, to validate putative genes, and to understand expression regulation in a continuous range of DNA.

Undergraduate Research

Dartmouth Senior Thesis: Discovery, Visualization and Analysis of Gene Regulatory Sequence Elements in Genomes

Advisor: Bob Gross

The advent of rapid DNA sequencing has produced an explosion in the amount of available sequence information, permitting us to ask many new questions about DNA. There is a pressing need to design algorithms that can provide answers to questions related to the control of gene expression, and thus to the structure, function, and behavior of organisms. Such algorithms must filter through massive amounts of informational noise to identify meaningful conserved regulatory DNA sequence elements.

We are approaching these questions with the notion that visualization is a key to exploring data relationships. Understanding the exact nature of these relationships can be very difficult by simply interpreting raw data. The ability to look at data in a graphical form allows us to apply our innate capacity to think visually to discern the subtle relationships that might not be recognizable otherwise. This thesis provides computational tools to visually identify and analyze candidate motifs in the DNA of a species.

This includes a parsing utility to store genomic data and an application to search for and visually identify motifs. Using these tools, novel and previously compiled gene sets were identified using the genome of the plant species Arabidopsis thaliana.