The Synthesis of Probabilistic Prediction & Mechanistic Modelling within a Computational & Systems Biology Context

Lead Research Organisation: University of Glasgow
Department Name: School of Computing Science

Abstract

The synergistic advances that can be made by the multidisciplinary interplay between abstracted computational modelling and biological experimental investigation within a systems biology context are poised to make major contributions to our understanding of some of the most important biological systems implicated in the genesis of many serious diseases such as cancer. However, due to the unavoidable inherent levels of uncertainty, noise and relative scarcity of biological data, it is vital that sound evidence-based scientific reasoning be enabled within a systems biology context by formally embedding mechanistic models within a probabilistic inferential framework. The synthesis of mechanistic modelling & probabilistic inference provides outstanding opportunities to make further significant advances in understanding biological systems and processes at multiple levels, by defining system components and inferring how they dynamically interact. Statistical machine learning methodology has a major role to play in both computational & systems biology research, and applications working at this interface present a number of important methodological challenges. However, one of the most important aspects of successful computational & systems biology research is that it must be conducted in direct collaboration with world-class experimental biologists. An outstanding feature of this Fellowship is that it has set in place six exciting collaborations with internationally leading cancer researchers, proteomics technologists, biochemists and plant biologists who are all fully committed to successfully driving forward a potentially groundbreaking multidisciplinary systems biology research programme as detailed in this proposal. Three important application areas within biological science will shape and direct the research to be undertaken during this Fellowship.
The applications are distinct, yet overlap in terms of the modelling & inferential issues which each presents, and this is important in ensuring a consistent and coherent line of research. They have also been selected for their major importance in the study of cellular mechanisms which are fundamental to cell function, some of which are implicated in certain serious diseases. In addition, the applicant has substantive ongoing collaborations with world-class laboratories engaged in these biological investigations. This ensures the proposed research programme is focused on realistic methodological problems which will have a direct impact on the major scientific questions being asked within each area, as well as contributing to the computational and inferential sciences. The first application will develop the inferential tools required by cancer biologists when reasoning about the structures underlying the observed dynamics of the MAPK pathway, and these tools will be employed in a large-scale study of this pathway in collaboration with the Beatson Institute of Cancer Research. The second application, to be conducted with the Plant Sciences group at the University of Glasgow, will seek to elucidate, in a model-based inferential manner, the remarkable observed phenomenon of organ specificity of the circadian clock in soybean and Arabidopsis. In addition, a study of models of transcriptional regulation in the cell cycle will be conducted. The final application will investigate a number of open issues associated with clinical transcriptomics and proteomics, where the identification of possible target genes and proteins is of vital importance to cancer researchers in their studies of, in this case, breast and ovarian cancer. This study will be conducted in direct conjunction with the Institute of Cancer Research, where a study of BRCA1&2 mutations implicated in breast and ovarian cancer is underway.

Publications

Calderhead B (2009) Estimating Bayes factors via thermodynamic integration and population MCMC in Computational Statistics & Data Analysis
Damoulas T (2009) Pattern recognition with a Bayesian kernel combination machine in Pattern Recognition Letters
Filippone M (2011) A Perturbative Approach to Novelty Detection in Autoregressive Models in IEEE Transactions on Signal Processing
Girolami M (2011) Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods in Journal of the Royal Statistical Society Series B: Statistical Methodology
Girolami M (2008) Bayesian inference for differential equations in Theoretical Computer Science
Polajnar T (2011) Protein interaction sentence detection using multiple semantic kernels in Journal of Biomedical Semantics
Psorakis I (2010) Multiclass relevance vector machines: sparsity and accuracy in IEEE Transactions on Neural Networks
Rogers S (2009) Semi-parametric analysis of multi-rater data in Statistics and Computing
Rogers S (2009) Infinite factorization of multiple non-parametric views in Machine Learning
Rogers S (2009) Probabilistic assignment of formulas to mass peaks in metabolomics experiments in Bioinformatics (Oxford, England)
Vyshemirsky V (2008) BioBayes: a software package for Bayesian inference in systems biology in Bioinformatics (Oxford, England)
Vyshemirsky V (2008) Bayesian ranking of biochemical system models in Bioinformatics (Oxford, England)
Zhong M (2011) Bayesian Methods to Detect Dye-Labelled DNA Oligonucleotides in Multiplexed Raman Spectra in Journal of the Royal Statistical Society Series C: Applied Statistics

 
Exploitation Route Cellular biologists are routinely employing the Bayesian approach to modelling and statistically testing signalling pathways, and a number of high-profile publications exploiting these results have appeared in journals such as Science Signaling and PNAS.
Sectors Chemicals, Healthcare, Pharmaceuticals and Medical Biotechnology

 
Description Impact of Fellowship. Firstly, let me consider the postdoctoral researchers and PhD student who were assigned to my fellowship. Dr Simon Rogers worked with me on the fellowship for two years before leaving to take up a permanent academic position (Lecturer) at the Department of Computing Science, University of Glasgow. His replacement, Dr Maurizio Filippone, then worked on the research programme of the fellowship for the remaining year, after which he also secured a permanent academic position (Lecturer) in the Department of Computing Science at the University of Glasgow. The PhD student, Mr Gary Macindoe, successfully defended his PhD thesis in June 2013 and is now working as a postdoctoral research assistant. My own career has developed superbly, and I have received a number of awards: a Royal Society Wolfson Research Merit Award (2012), election to the Fellowship of the Royal Society of Edinburgh (2011) and the Pioneer Award from SPIE (2009). In 2012 I was also successful in obtaining an EPSRC Established Career Fellowship. In 2010 I moved from the University of Glasgow to take a Chair in Statistics at University College London, where I was also appointed to a professorial position in the Department of Computer Science and made Director of the Centre for Computational Statistics and Machine Learning. This was a major leap forward in the seniority of my career and has presented further opportunities for my development. In terms of the research that I was able to undertake, there are a number of highlights which are having ongoing impact; without doubt, however, the paper of mine that was selected to be read before the Royal Statistical Society is a major success.
The paper "Riemann manifold Langevin and Hamiltonian Monte Carlo methods" attracted the largest number of contributions to a 'read paper' in the history of the society (established 1834), has already gathered over 130 citations and is the most downloaded article on the publisher's website. The Science Signaling paper that was published with collaborators was a landmark in that, for the first time, MAPK pathway modelling and Bayesian inference were used to inform subsequent gene knockdown experiments, establishing the proof of principle of statistical inference informing subsequent biological experiments. My work with Mosaiques Diagnostics has helped to define the field of clinical proteomics in terms of statistical validity and standards of evaluation.
 
Description ASSET - Analysing and Striking the Sensitivities of Embryonal Tumours
Amount £494,760 (GBP)
Funding ID 259348 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 11/2010 
End 11/2015
 
Description Advancing Machine Learning Methodology for New Classes of Prediction Problems
Amount £252,135 (GBP)
Funding ID EP/F009429/2 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 04/2008 
End 02/2012
 
Description Advancing the Geometric Framework for Computational Statistics: Theory, Methodology and Modern Day Applications
Amount £663,347 (GBP)
Funding ID EP/J016934/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 05/2013 
End 04/2018
 
Description Cross-Disciplinary Feasibility Account : Computational Statistics and Cognitive Neuroscience
Amount £263,822 (GBP)
Funding ID EP/H024875/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2010 
End 01/2012
 
Description ENGAGE : Interactive Machine Learning Accelerating Progress in Science, An Emerging Theme of ICT Research
Amount £674,580 (GBP)
Funding ID EP/K015664/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 02/2013 
End 01/2016
 
Description Inference-based Modelling in Population and Systems Biology
Amount £238,885 (GBP)
Funding ID BB/G006997/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 04/2009 
End 03/2012
 
Description Network on Computational Statistics and Machine Learning
Amount £104,530 (GBP)
Funding ID EP/K009788/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 06/2013 
End 06/2016
 
Description The Silicon Trypanosome
Amount £626,769 (GBP)
Funding ID BB/I004599/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 11/2011 
End 10/2013