Integrative modelling of stochasticity, noise, heterogeneity and measurement error in the study of model biological systems

Lead Research Organisation: Newcastle University
Department Name: Mathematics and Statistics

Abstract

Recent breakthroughs in experimental technology have allowed the study of the dynamics of cell biochemistry at single-cell resolution. These studies confirm earlier theoretical predictions that such dynamics would be intrinsically stochastic. Such randomness in cellular behaviour has now been confirmed to be an important component of the observed heterogeneity of genetically identical cells cultured in a uniform environment. However, intrinsic stochasticity is just one source of heterogeneity in biological data, and others, such as minor unavoidable variations in environment and limitations in the measurement technology, can have an equally important effect. A major goal of modern biology is to build dynamic, predictive, quantitative models of the behaviour of biological systems. Computational models enable in silico testing of plausible biological hypotheses and help to establish a clearer understanding of the complex genetic and biochemical mechanisms at play. Despite technological advances, much of the biological data used to build, refine and test models consists of measurements on cell populations. Unless the models that we build properly reflect the multiple sources of heterogeneity in such data, it is difficult to use the data to test model adequacy, or to refine the model parameters or structure so that they more accurately reflect the underlying biology. The only coherent framework for mathematically modelling noise and heterogeneity is based on probability theory, and especially the theory of stochastic processes. Single-cell stochastic kinetic models are already established as a powerful tool for modelling intrinsic noise in simple genetic and biochemical networks. However, in such cases the emphasis is almost exclusively on the effect of noise on single-cell dynamics. It is now important to move beyond this and build integrated stochastic models that incorporate multiple sources of noise and heterogeneity.
Stochastic models which integrate multiple levels of organisation are therefore a vital part of the vision for Systems Biology. This proposal comes from a statistician already expert in the necessary mathematical, computational and stochastic modelling techniques. The plan is to train him in modern molecular and cell biology, and associated experimental techniques. The idea is not to 'convert' him into a bench biologist, but by spending time in the lab he will better appreciate the practical issues confronted by 'wet lab' scientists, and will be in a much better position to be able to accurately model the multiple sources of heterogeneity that give rise to experimental data. This will allow the development of a number of examples of integrative stochastic models of biological systems, in order to demonstrate the utility of the approach.

Technical Summary

The proposal is to train an expert stochastic modeller (but novice biologist) in modern cell and molecular biology theory and experimental techniques, primarily through embedding in two excellent labs. One lab works on studying noise and heterogeneity in Bacillus subtilis competence development, at the single-cell level, via fluorescence microscopy live cell imaging and flow cytometry, using GFP reporters. As well as shadowing research assistants and technicians, the fellow will conduct a small experimental programme of his own, and use the insights gained for building more realistic models of the bi-stable Bacillus competence system. The models developed will lead to new hypotheses about the mechanisms underlying competence development. The models will also suggest how the hypotheses can be tested in the lab, leading to further refinement of the models. The second lab works on studying the cellular response to telomere uncapping in Saccharomyces cerevisiae. As well as using a variety of more conventional molecular biology techniques, the lab specialises in the use of high-throughput technology for large-scale study of genetic and environmental effects on the damage response phenotype. In particular, they have a state-of-the-art robotic system for genome-wide screening of mutants which generates large amounts of semi-quantitative data with a complex error structure. Accurate statistical modelling of the whole-system behaviour is a non-trivial challenge, and requires a detailed working knowledge of the robot. Again, the fellow will begin by shadowing members of the lab before conducting his own genome-wide screening experiments on the robot. The experience will be used to build accurate statistical models of the data, and then use them to conduct inference for mechanistic stochastic kinetic models of the telomere-uncapping response. Again, an iterative process of modelling and lab investigation of in silico predicted hypotheses will be employed.

Publications

Büchel F (2012) Qualitative translation of relations from BioPAX to SBML qual. in Bioinformatics (Oxford, England)

Heydari J (2016) Bayesian hierarchical modelling for inferring genetic interactions in yeast. in Journal of the Royal Statistical Society. Series C, Applied statistics

Jow H (2014) Bayesian identification of protein differential expression in multi-group isobaric labelled mass spectrometry data. in Statistical applications in genetics and molecular biology

Lawrence, Neil D.; Girolami, Mark; Rattray, Magnus; Sanguinetti, Guido (2010) Learning and Inference in Computational Systems Biology

 
Description The principal aim of the Fellowship was to train a statistician/mathematical modeller with no pre-existing laboratory experience in modern experimental biology, in order to provide him with a deeper understanding of the processes leading to quantitative and semi-quantitative experimental data on important cellular processes at the level of single cells. This was facilitated through collaboration with two leading experimental labs at Newcastle University, one specialising in budding yeast genetics relevant to telomere biology, and the other focussed on key decision processes in the Gram positive bacterium Bacillus subtilis. The fellow worked extensively in the two labs, learning basic modern molecular biology techniques, as applied to two different single-celled model organisms (one eukaryotic and one prokaryotic).

He learned how to genetically manipulate the organisms and integrate fluorescent reporter genes. He also learned to conduct experiments and assays that exhibit a heterogeneous response across a population of cells from an isogenic cell culture. Single-cell techniques such as flow cytometry and time-lapse fluorescence microscopy were used to generate information on the dynamics of intracellular processes. The resulting data can be quantified, calibrated and used to help parameterise mathematical models of the biological processes. Stochastic single-cell models are required to capture both the noise apparent in single-cell dynamics and the heterogeneity of the cell population response. To do this effectively, however, the relationship between the cellular processes of interest and the available reporter data must itself be modelled carefully and stochastically. This requires a clear understanding of both the associated biology and the experimental technologies used to generate the data.
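
The layering described above (intrinsic noise within a cell, heterogeneity between cells, and error in the reporter readout) can be illustrated with a minimal sketch. It is not the Fellow's model or software: it assumes a simple immigration-death process for the reporter protein simulated with Gillespie's direct method, a log-normally distributed per-cell production rate as the extrinsic source, and a linear gain with Gaussian error as the measurement model. All function names, parameter names and values are hypothetical.

```python
import math
import random
import statistics


def gillespie_birth_death(k_prod, k_deg, t_end, rng, x0=0):
    """Gillespie direct method for production (rate k_prod) and
    degradation (rate k_deg * x); returns the count at time t_end."""
    t, x = 0.0, x0
    while True:
        birth = k_prod
        death = k_deg * x
        total = birth + death          # always > 0 because k_prod > 0
        t += rng.expovariate(total)    # waiting time to the next reaction
        if t > t_end:
            return x
        if rng.random() * total < birth:
            x += 1
        else:
            x -= 1


def simulate_population(n_cells, seed=1, k_prod_mean=20.0, extrinsic_cv=0.3,
                        k_deg=1.0, t_end=10.0, gain=5.0, meas_sd=10.0):
    """Return (true molecule counts, noisy fluorescence readouts) per cell.

    Assumed layers (illustrative only):
      extrinsic   - per-cell production rate, log-normal with the given CV
      intrinsic   - stochastic kinetics within each cell
      measurement - readout = gain * count + Gaussian error
    """
    rng = random.Random(seed)
    # Log-normal parameters chosen so the rate has mean k_prod_mean
    # and coefficient of variation extrinsic_cv.
    sigma = math.sqrt(math.log(1.0 + extrinsic_cv ** 2))
    mu = math.log(k_prod_mean) - 0.5 * sigma ** 2
    counts, readouts = [], []
    for _ in range(n_cells):
        k_prod = rng.lognormvariate(mu, sigma)
        x = gillespie_birth_death(k_prod, k_deg, t_end, rng)
        y = gain * x + rng.gauss(0.0, meas_sd)
        counts.append(x)
        readouts.append(y)
    return counts, readouts


if __name__ == "__main__":
    counts, readouts = simulate_population(500)
    fano = statistics.variance(counts) / statistics.mean(counts)
    print("Fano factor of true counts:", round(fano, 2))
    print("Variance of readouts:", round(statistics.variance(readouts), 1))
```

For this particular process the count in a single cell with a fixed production rate is Poisson distributed, so its Fano factor (variance divided by mean) is 1. A population Fano factor clearly above 1 in the simulated counts therefore reflects the assumed cell-to-cell variation in the rate, and the readout variance is inflated further by the measurement layer. Fitting such a model to real reporter data would need a calibrated measurement model, which this sketch does not attempt.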

The fellow has used these techniques to gain new insight into the complex regulation of key cellular decision processes in B. subtilis and mechanisms of telomere protection and maintenance in budding yeast. As well as publishing the results in the regular scientific literature, he has developed open source software to implement the methods, and documented the algorithms carefully in a book.
Exploitation Route The research findings can be developed by others in the usual incremental, scientific way. However, the main lesson for others is that it is possible for someone from a quantitative background to learn a lot about modern biology, and that this is very helpful for building productive inter-disciplinary collaborations.
Sectors Digital/Communication/Information Technologies (including Software),Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology

 
Description The primary impact of this Fellowship was the personal development of the Fellow. This was a success: the Fellow continues to be active at the interface between the quantitative and life sciences, and has assumed a number of leadership roles. The impact of the research is harder to judge, but the main position paper has been cited more than 200 times (according to Google Scholar), and the associated new textbook edition has sold more than 1,000 copies and has been very highly cited. This suggests a broad impact of the work, beyond academic research, into teaching and wider society.
First Year Of Impact 2010
Sector Digital/Communication/Information Technologies (including Software),Environment
Impact Types Societal,Policy & public services

 
Description BBSRC Exploiting New Ways of Working Strategy Advisory Panel
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
URL http://www.bbsrc.ac.uk/about/structures/panels/enww-panel/
 
Description BBSRC Working Group on BSC/ENWW
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
 
Description EPSRC CDT in Cloud Computing for Big Data
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
URL http://teaching.ncl.ac.uk/ccfbd-cdt/
 
Description EPSRC/RSS Graduate Training Programme in Statistics (Short course on Systems Biology Modelling)
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers