Our goal is to enable predictive medicine and rational bioengineering. This requires (1) the ability to predict how complex biological behaviors arise from individual molecules and their interactions and (2) the ability to reverse engineer biological systems to have specific phenotypes. Although great progress has been made over the past several decades in the biochemical characterization of individual cells and organisms due to the advent of high-throughput measurement technologies, much work remains to understand and predict how phenotypes emerge. Novel computational techniques which integrate heterogeneous data and mathematics are needed to comprehend the overwhelming complexity of biology. These techniques will not only allow us to predict phenotype from genotype, but ultimately enable us to engineer genotypes to have desired phenotypes.
Under this general theme of understanding how complex phenotypes emerge from the detailed biochemistry inside individual cells, our interests are to:
- Develop the detailed, comprehensive computational models that will be required for predictive medicine and bioengineering.
- Use large-scale models to drive biological discovery using experiments guided by model predictions. For example, use models to predict how complex cellular behaviors like growth are controlled at the molecular level and test these predictions experimentally. Investigate which enzymes exert the most control over the rate of cellular growth. Investigate the maximum rate at which an organism can grow. Identify the optimal distribution of gene expression which maximizes cellular growth.
- Develop novel computational frameworks, algorithms, databases, and analysis tools which enable large-scale hybrid models.
- Explore new paradigms for collaboratively developing large-scale models.
Recently we developed and validated the first "whole-cell" model of the life cycle of a single cell in collaboration with Markus Covert and Jayodita Sanghvi. Toward this goal we've also developed new methods for model and data integration, model fitting and parameter estimation, and exploratory data analysis.
Please see below for more information about our recent projects. All of the software and data we've developed through these efforts are listed on the Resources page.
Whole-cell computational model of M. genitalium
As a proof of principle for the detailed computational models that are needed for rational bioengineering and predictive medicine, we recently developed a model of M. genitalium, one of the smallest known freely culturable organisms.
The model accounts for the specific function of every annotated gene product and predicts dynamics of every molecular species over the entire cell cycle. The model is composed of 28 sub-models of cellular processes including replication, transcription, and translation. Each sub-model was implemented using different mathematics and trained using different experimental data. We validated the model against a broad range of publicly available data. We've used the model to gain several novel insights into cell physiology. The animation illustrates the predicted life cycle of one in silico cell.
Developing the model gave us an opportunity to think about how to build large hybrid models. Constructing it required us to develop a new computational platform, new algorithms, and new tools for data analysis. We believe these new methods and tools will be broadly applicable to the development of similarly comprehensive models of more complex organisms, and will ultimately form the basis of bio-CAD systems for bioengineering and medicine.
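To make the idea of composing sub-models more concrete, here is a minimal sketch of one way that independently written sub-models can share a common cell state and be advanced together in short time steps. It is not the platform described above: the three toy sub-models, their rate constants, and the names `transcription`, `translation`, `degradation`, and `simulate` are all hypothetical and chosen only for illustration.

```python
import random

# Toy sub-models. Each one reads the shared state and returns the changes it
# wants to apply. All rate constants below are illustrative assumptions.

def transcription(state, dt, rng):
    # Stochastic sub-model: each gene fires with probability 0.05 * dt per step.
    new_mrna = sum(1 for _ in range(state["genes"]) if rng.random() < 0.05 * dt)
    return {"mrna": new_mrna}

def translation(state, dt, rng):
    # Deterministic sub-model: protein synthesis proportional to mRNA level.
    return {"protein": 2.0 * state["mrna"] * dt}

def degradation(state, dt, rng):
    # Deterministic sub-model: first-order decay of mRNA and protein.
    return {
        "mrna": -0.01 * state["mrna"] * dt,
        "protein": -0.001 * state["protein"] * dt,
    }

def simulate(initial_state, submodels, steps, dt=1.0, seed=0):
    """Advance all sub-models together, one short time step at a time."""
    rng = random.Random(seed)
    state = dict(initial_state)
    for _ in range(steps):
        # Every sub-model sees the same start-of-step snapshot, so within a
        # step the sub-models are treated as independent of one another.
        snapshot = dict(state)
        for submodel in submodels:
            for key, delta in submodel(snapshot, dt, rng).items():
                state[key] = max(0.0, state[key] + delta)
    return state

if __name__ == "__main__":
    start = {"genes": 10, "mrna": 0.0, "protein": 0.0}
    print(simulate(start, [transcription, translation, degradation], steps=100))
```

The design choice worth noting is the snapshot: because each sub-model only reads the start-of-step state and only reports deltas, a stochastic sub-model and a deterministic one can sit side by side without knowing about each other's mathematics.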
Integrated computational model of E. coli
As a proof of principle for a whole-cell model composed of sub-models of distinct cellular processes, we developed a hybrid model of E. coli. This model consisted of a flux balance analysis model of global metabolism, an ODE model of central carbon metabolism, and a Boolean model of transcriptional regulation. The animation below displays the model's predictions for the metabolic flux distribution of an E. coli culture growing in media containing glucose and lactose.
This model demonstrated the power of integrating multiple computational sub-models to create a single, more predictive model. Perhaps more importantly, the model provided us with our first opportunity to investigate the mathematics and computational structures required for hybrid computational models.
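As a rough illustration of how a Boolean regulatory rule can be coupled to continuous dynamics, the sketch below simulates a culture growing on glucose and lactose. It is a toy, not the E. coli model described above: a single Boolean rule stands in for the regulatory network, Michaelis-Menten uptake integrated with Euler steps stands in for the ODE sub-model, and a fixed biomass yield replaces the flux balance analysis step. Every parameter value and the names `lac_genes_on`, `step`, and `simulate` are assumptions made for illustration.

```python
THRESHOLD = 1e-3     # mmol/L below which a sugar counts as absent (assumed)
YIELD = 0.05         # gDW of biomass per mmol of sugar (assumed)
VMAX_GLC, VMAX_LAC = 10.0, 5.0   # maximum uptake rates, mmol/gDW/h (assumed)
KM = 0.5             # half-saturation constant, mmol/L (assumed)

def lac_genes_on(glucose, lactose):
    """Boolean rule: lactose genes are expressed only if glucose is absent
    and lactose is present."""
    return glucose <= THRESHOLD and lactose > THRESHOLD

def step(state, dt):
    glc, lac, x = state["glucose"], state["lactose"], state["biomass"]
    # Regulation decides which uptake routes are available in this step.
    lac_on = lac_genes_on(glc, lac)
    # Kinetic uptake rates; lactose uptake is switched off by the Boolean rule.
    v_glc = VMAX_GLC * glc / (KM + glc)
    v_lac = VMAX_LAC * lac / (KM + lac) if lac_on else 0.0
    # Never take up more substrate than is left in the medium.
    v_glc = min(v_glc, glc / (x * dt))
    v_lac = min(v_lac, lac / (x * dt))
    # Fixed-yield growth rate (a stand-in for an optimized FBA growth rate).
    mu = YIELD * (v_glc + v_lac)
    # Forward Euler update of the shared state.
    return {
        "glucose": max(0.0, glc - v_glc * x * dt),
        "lactose": max(0.0, lac - v_lac * x * dt),
        "biomass": x + mu * x * dt,
    }

def simulate(initial_state, hours=20.0, dt=0.01):
    """Return the list of states visited, starting with the initial state."""
    history = [dict(initial_state)]
    for _ in range(int(round(hours / dt))):
        history.append(step(history[-1], dt))
    return history

if __name__ == "__main__":
    start = {"glucose": 5.0, "lactose": 5.0, "biomass": 0.01}
    for state in simulate(start)[::200]:
        print({k: round(v, 4) for k, v in state.items()})
```

Under these assumptions the culture uses glucose first and switches to lactose only once glucose is exhausted, which is the kind of behavior that emerges from the interaction between the regulatory and metabolic sub-models rather than from either one alone.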
Dynamical data visualization
Although biology is highly structured and highly dynamic, most data analysis tools ignore the network structure of biological data. Consequently, most analysis tools fail to take full advantage of biologists' prior knowledge and intuition.
We have developed web-based software for visualizing complex, dynamic data such as flow cytometry measurements. The software provides a simple interface for drawing biological pathway diagrams and for combining them with experimental data to produce animated pathway diagrams. The animation below displays flow cytometry T-cell data collected by Jonathan Irish from the Nolan Lab.
Most importantly, the software enables biologists to take full advantage of their prior knowledge and intuition when analyzing their data. It also enables scientists to quickly and easily explore very high-dimensional datasets. We anticipate that this software will be broadly useful to researchers working with high-dimensional biomolecular data.