Uncertainty Analysis for Random Computer Models
Lead Research Organisation:
Durham University
Department Name: Mathematical Sciences
Abstract
PROJECT SUMMARY
Analysis of mathematical models is one of the principal tools for studying complex systems. Uncertainty analysis for computer models of such systems is concerned with system calibration, i.e. learning about system inputs from system data, and system forecasting, i.e. learning about system outputs. Such analysis must address many sources of uncertainty, relating to imperfect knowledge of model input values, limited knowledge of the form of the model, discrepancies between the model and the system and errors in system data. These problems are particularly acute for high dimensional models which are expensive in computer time to evaluate. A general Bayesian approach has been developed which treats all of these uncertainties in a unified manner. As yet, the method is not well suited to analysing high dimensional random computer models, i.e. those which, when evaluated repeatedly for the same input, will give different outputs. This project concerns methodology which will allow us to incorporate the treatment of such models within the standard Bayesian approach to computer modelling, while remaining tractable even for high dimensional input and output spaces. One of the key features of the general Bayesian approach, when we cannot fully evaluate the function of interest, is to express uncertainty about the function values using an emulator, which specifies such uncertainty by combining global regression modelling with residual Gaussian process forms. We shall develop emulators both for the full joint probability distributions produced by the computer model, and also for useful summary quantities from the distribution. In particular, for high dimensional problems, we will exploit the simplifications of the Bayes linear approach to retain tractability in our inferences.
This approach will also exploit related developments concerning uncertainty analysis for multilevel models, as we can view sample size for given inputs as determining the accuracy level of the model. Given the emulator, we will develop methods for system calibration and forecasting, for a given collection of simulator evaluations and system data. We also address the design problem of choosing the values of the inputs, and the number of repetitions for each choice, at which we shall evaluate the function. Design will be batch sequential, moving the focus from designs for building the emulators to designs for testing the emulators to designs for solving the inferential problems for the system when the emulator is stable. The Bayes linear approach will be used to retain tractability for design calculations for high dimensions. The approach is general but we will focus on two substantial and important applications. The first arises in high energy physics where the aim is to improve understanding of the behaviour of subatomic particles based on models and data arising in high energy particle collisions. The second arises in systems biology where we are concerned with identification of large biochemical network models exploiting data based on gene expression levels.
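The emulator idea described in the summary (a global regression trend plus a correlated residual, with the number of repetitions at each input setting the accuracy of the observed sample mean) can be illustrated with a small Bayes linear adjustment. The following is only a minimal sketch under stated assumptions, not the project's implementation: it assumes a one-dimensional input, a linear trend fitted by least squares and then treated as known, a squared-exponential residual covariance with fixed hyperparameters, and a toy stochastic simulator. The names `sq_exp_cov`, `bayes_linear_emulator`, `resid_var` and `length` are hypothetical.

```python
import numpy as np


def sq_exp_cov(xa, xb, variance, length):
    """Squared-exponential covariance between two 1-D arrays of inputs."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-(d / length) ** 2)


def bayes_linear_emulator(x_design, runs, x_new, resid_var=1.0, length=0.3):
    """Adjust beliefs about the simulator mean at x_new using replicated runs.

    runs[i] holds the repeated simulator outputs at x_design[i].
    resid_var and length are assumed (fixed) residual hyperparameters.
    """
    n_reps = np.array([len(r) for r in runs])
    means = np.array([np.mean(r) for r in runs])
    # Pooled within-input variance: the simulator's own randomness.
    noise_var = np.mean([np.var(r, ddof=1) for r in runs])

    # Global regression part: linear trend fitted by least squares
    # (a simplification; uncertainty in the coefficients is ignored).
    def basis(x):
        return np.column_stack([np.ones_like(x), x])

    beta, *_ = np.linalg.lstsq(basis(x_design), means, rcond=None)
    prior_design = basis(x_design) @ beta
    prior_new = basis(x_new) @ beta

    # Variance of the observed sample means: correlated residual plus
    # sampling noise that shrinks as the number of repetitions grows.
    var_d = sq_exp_cov(x_design, x_design, resid_var, length)
    var_d = var_d + np.diag(noise_var / n_reps)
    cov_new_d = sq_exp_cov(x_new, x_design, resid_var, length)

    # Bayes linear adjustment:
    #   E_D[f] = E[f] + Cov[f, D] Var[D]^{-1} (D - E[D])
    #   Var_D[f] = Var[f] - Cov[f, D] Var[D]^{-1} Cov[D, f]
    weights = np.linalg.solve(var_d, cov_new_d.T).T
    adj_mean = prior_new + weights @ (means - prior_design)
    adj_var = resid_var - np.sum(weights * cov_new_d, axis=1)
    return adj_mean, adj_var


# Toy stochastic simulator (an assumption for illustration only).
rng = np.random.default_rng(0)
x_design = np.linspace(0.0, 1.0, 6)
rep_counts = [5, 5, 20, 20, 5, 5]
runs = [np.sin(3 * x) + rng.normal(0.0, 0.2, n)
        for x, n in zip(x_design, rep_counts)]
x_new = np.array([0.1, 0.5, 0.9])
adj_mean, adj_var = bayes_linear_emulator(x_design, runs, x_new)
```

Only means, variances and covariances are specified here, which is what keeps the calculation to a single linear solve. Increasing the number of repetitions at an input reduces the corresponding diagonal noise term, so the same code shows how sample size acts as an accuracy level.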
Publications
Andrianakis I
(2015)
Bayesian history matching of complex infectious disease models using emulation: a tutorial and a case study on HIV in Uganda.
in PLoS computational biology
Bower R
(2010)
Rejoinder
in Bayesian Analysis
Bower R
(2010)
Galaxy formation: a Bayesian uncertainty analysis
in Bayesian Analysis
Bower R
(2010)
The parameter space of galaxy formation
in Monthly Notices of the Royal Astronomical Society
Goldstein M
(2013)
Environmental Modelling - Finding Simplicity in Complexity
House L
(2009)
Second Order Exchangeable Computer Models
House L
Exchangeable Computer Models with Application to a Galaxy Formation Simulation (in prep)
in Journal of Uncertainty Quantification
Rodrigues L
(2017)
Constraints on galaxy formation models from the galaxy stellar mass function and its evolution
in Monthly Notices of the Royal Astronomical Society
Vernon I
(2014)
Galaxy Formation: Bayesian History Matching for the Observable Universe
in Statistical Science
Vernon I
(2010)
A Bayes Linear Approach to Systems Biology
Vernon I
Bayes Linear Emulation and History Matching of Stochastic Systems Biology Models (in prep)
in Journal of Uncertainty Quantification
Description | We discovered methods for analysing and understanding extremely complex models of the universe, of biology including genes and metabolisms and of many other physical systems. These methods allow us to understand all of the major uncertainties present in such systems, and hence can be used to reduce our uncertainty about the most important parts of the real world physical system. |
Exploitation Route | The Bayesian emulation methodology is widely applicable and we are currently pushing it out into many sciences (cosmology, systems biology, environmental science, oil reservoir engineering, epidemiology) all of which I am actively involved in. Our emulation methods have already been commercialised in the oil industry. |
Sectors | Chemicals,Digital/Communication/Information Technologies (including Software),Energy,Environment,Financial Services, and Management Consultancy,Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology,Transport |
Description | Our Bayesian emulation methodology has already been used to perform uncertainty analyses in cosmology, systems biology (for designing future experiments), environmental science, climate science, oil reservoir engineering and epidemiology. |
First Year Of Impact | 2009 |
Sector | Chemicals,Digital/Communication/Information Technologies (including Software),Energy,Environment,Other |
Impact Types | Cultural |
Description | Bayesian Analysis of Epidemiology HIV Models |
Organisation | London School of Hygiene and Tropical Medicine (LSHTM) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We helped write the MRC proposal entitled "Calibration and analysis of complex models: methods development and application to explore HAART impact on HIV in Africa" for which Richard White (LSHTM) was PI. This was successful and commenced in Oct 2012 and funded 30 percent of my salary for two years, along with other expenses. The research outlined in this proposal involves the emulation of stochastic HIV epidemiology models: this is a direct consequence of the work I did on my EPSRC grant. |
Collaborator Contribution | They provided complex models of HIV in Uganda for our use, along with substantial computational resources and a full time postdoc to carry out the main part of the project. |
Impact | Major advances made in the analysis of HIV models. First publication entitled "Bayesian history matching of complex infectious disease models using emulation: a tutorial and a case study on HIV in Uganda" (see full list of publications) and two more in prep. |
Start Year | 2012 |
Description | Bayesian Uncertainty Analysis in Systems Biology |
Organisation | Durham University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We developed Bayesian emulation techniques to analyse systems biology models of gene and metabolic networks, specifically for the problems of parameter searching and of designing the most informative future experiments. This has solved two major problems in the systems biology area. |
Collaborator Contribution | Our partners provided several systems biology models to use in our analysis, experimental data gathered previously, and most importantly performed additional lab experiments on Arabidopsis plants as directed by our Bayesian design of experiments approach. |
Impact | Definitely multidisciplinary as it involves Bayesian statisticians in the Department of Mathematical Sciences (Vernon, Goldstein), and biologists (Liu, Lindsey). Two major research problems have been solved. We obtained seed corn funding for this project which allowed us to present a 3-hour workshop on our techniques at the major annual Systems Biology conference (ICSB 2012, Toronto). Three papers on this topic are in various stages of preparation. We will soon prepare a substantial BBSRC proposal. |
Start Year | 2011 |
Description | Bayesian Uncertainty in Galaxy Formation |
Organisation | Durham University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We developed Bayesian emulation techniques for computer model uncertainty analysis and applied them to galaxy formation simulations, for use in parameter search problems. Our techniques solved a major problem in this area and have led to several publications and a major research prize. |
Collaborator Contribution | Our partners in the Institute for Computational Cosmology provided several complex galaxy formation simulation models for us to use, and provided a large amount of computational resources for our use, along with much contextual expertise. |
Impact | Definitely multidisciplinary: involving Bayesian statisticians (Vernon, Goldstein) and several cosmologists (Carlos Frenk, Richard Bower, Shaun Cole, Carlton Baugh, Cedric Lacey, Andrew Benson). It has led to 5 publications (in Bayesian Analysis, MNRAS, Statistical Science), and most importantly to the award of the top worldwide prize for Bayesian statistics: the Mitchell Prize for an invited discussion paper that I was first author on, "Galaxy Formation: a Bayesian Uncertainty Analysis". In addition we wrote an invited review article in Statistical Science, a top stats journal. |
Start Year | 2006 |