Uncertainty Analysis for Random Computer Models

Lead Research Organisation: Durham University
Department Name: Mathematical Sciences


PROJECT SUMMARY
Analysis of mathematical models is one of the principal tools for studying complex systems. Uncertainty analysis for computer models of such systems is concerned with system calibration, i.e. learning about system inputs from system data, and system forecasting, i.e. learning about system outputs. Such analysis must address many sources of uncertainty, relating to imperfect knowledge of model input values, limited knowledge of the form of the model, discrepancies between the model and the system and errors in system data. These problems are particularly acute for high dimensional models which are expensive in computer time to evaluate. A general Bayesian approach has been developed which treats all of these uncertainties in a unified manner. As yet, the method is not well suited to analyse high dimensional random computer models, i.e. those which, when evaluated repeatedly for the same input, will give different outputs. This project concerns methodology which will allow us to incorporate the treatment of such models within the standard Bayesian approach to computer modelling, while remaining tractable even for high dimensional input and output spaces. One of the key features of the general Bayesian approach, when we cannot fully evaluate the function of interest, is to express uncertainty about the function values using an emulator, which specifies such uncertainty by combining global regression modelling with residual Gaussian process forms. We shall develop emulators both for the full joint probability distributions produced by the computer model, and also for useful summary quantities from the distribution. In particular, for high dimensional problems, we will exploit the simplifications of the Bayes linear approach to retain tractability in our inferences.
This approach will also exploit related developments concerning uncertainty analysis for multilevel models, as we can view sample size for given inputs as determining the accuracy level of the model. Given the emulator, we will develop methods for system calibration and forecasting, for a given collection of simulator evaluations and system data. We also address the design problem of choosing the values of the inputs, and the number of repetitions for each choice, at which we shall evaluate the function. Design will be batch sequential, moving the focus from designs for building the emulators to designs for testing the emulators to designs for solving the inferential problems for the system when the emulator is stable. The Bayes linear approach will be used to retain tractability for design calculations for high dimensions. The approach is general but we will focus on two substantial and important applications. The first arises in high energy physics where the aim is to improve understanding of the behaviour of subatomic particles based on models and data arising in high energy particle collisions. The second arises in systems biology where we are concerned with identification of large biochemical network models exploiting data based on gene expression levels.
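As an illustration only, the emulator idea described in the summary (a global regression term plus a correlated residual, updated by a Bayes linear adjustment, with the number of repetitions at each input setting the accuracy of the observed sample mean) can be sketched for a one-dimensional input. This is a minimal sketch under stated assumptions, not the project's implementation: the function names, the linear regression basis, the squared exponential covariance, the least-squares fit of the regression coefficients and all parameter values are hypothetical choices made for the example.

```python
import numpy as np


def sq_exp_cov(xa, xb, variance, length):
    """Squared exponential covariance between two sets of 1-D inputs (assumed form)."""
    d = xa[:, None] - xb[None, :]
    return variance * np.exp(-(d / length) ** 2)


def fit_emulator(x, ybar, n_reps, noise_var, resid_var=1.0, length=0.3):
    """Fit a simple emulator to sample means of a stochastic simulator.

    x         : design inputs, shape (n,)
    ybar      : sample mean of the simulator output at each input, shape (n,)
    n_reps    : number of repetitions behind each sample mean, shape (n,)
    noise_var : assumed run-to-run variance of the simulator at a fixed input
    """
    # Global regression part: intercept plus linear term, fitted by least squares
    # (a simplification; a full treatment would also carry uncertainty on beta).
    H = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(H, ybar, rcond=None)
    resid = ybar - H @ beta
    # Variance of the observed residuals: correlated residual process plus
    # sampling variance noise_var / n_reps of each sample mean.
    var_D = sq_exp_cov(x, x, resid_var, length) + np.diag(noise_var / n_reps)
    return {"x": x, "beta": beta, "resid": resid, "var_D": var_D,
            "resid_var": resid_var, "length": length}


def predict(em, x_new):
    """Bayes linear adjusted expectation and variance at new inputs."""
    H_new = np.column_stack([np.ones_like(x_new), x_new])
    cov = sq_exp_cov(x_new, em["x"], em["resid_var"], em["length"])
    # w = Var[D]^{-1} Cov[D, f(x_new)], one column per new input.
    w = np.linalg.solve(em["var_D"], cov.T)
    mean = H_new @ em["beta"] + w.T @ em["resid"]
    var = em["resid_var"] - np.sum(cov * w.T, axis=1)
    return mean, var


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 1.0, 8)
    n_reps = np.full(x.size, 20)
    # Toy stochastic simulator: known mean function plus run-to-run noise.
    runs = np.sin(3.0 * x)[:, None] + rng.normal(0.0, 0.3, size=(x.size, 20))
    em = fit_emulator(x, runs.mean(axis=1), n_reps, noise_var=0.3 ** 2)
    mean, var = predict(em, np.array([0.25, 0.5, 2.0]))
    print(mean, var)
```

In this sketch, increasing `n_reps` at a design point shrinks the corresponding diagonal term of `var_D`, so the adjusted variance near that point falls. This mirrors the view of sample size as an accuracy level, and the trade-off between adding new inputs and adding repetitions is the design question raised above.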
Description We discovered methods for analysing and understanding extremely complex models of the universe, of biological systems (including gene and metabolic networks), and of many other physical systems. These methods allow us to understand all of the major uncertainties present in such systems, and hence can be used to reduce our uncertainty about the most important parts of the real world physical system.
Exploitation Route The Bayesian emulation methodology is widely applicable and we are currently pushing it out into many sciences (cosmology, systems biology, environmental science, oil reservoir engineering, epidemiology) all of which I am actively involved in. Our emulation methods have already been commercialised in the oil industry.
Sectors Chemicals,Digital/Communication/Information Technologies (including Software),Energy,Environment,Financial Services, and Management Consultancy,Manufacturing, including Industrial Biotechnology,Pharmaceuticals and Medical Biotechnology,Transport

Description Our Bayesian emulation methodology has already been used to perform uncertainty analyses in cosmology, systems biology (for designing future experiments), environmental science, climate science, oil reservoir engineering and epidemiology.
First Year Of Impact 2009
Sector Chemicals,Digital/Communication/Information Technologies (including Software),Energy,Environment,Other
Impact Types Cultural

Description Bayesian Analysis of Epidemiology HIV Models 
Organisation London School of Hygiene and Tropical Medicine (LSHTM)
Country United Kingdom 
Sector Academic/University 
PI Contribution We helped write the MRC proposal entitled "Calibration and analysis of complex models: methods development and application to explore HAART impact on HIV in Africa" for which Richard White (LSHTM) was PI. This was successful and commenced in Oct 2012 and funded 30 percent of my salary for two years, along with other expenses. The research outlined in this proposal involves the emulation of stochastic HIV epidemiology models: this is a direct consequence of the work I did on my EPSRC grant.
Collaborator Contribution They provided complex models of HIV in Uganda for our use, along with substantial computational resources and a full time postdoc to carry out the main part of the project.
Impact Major advances made in the analysis of HIV models. The first publication is entitled "Bayesian history matching and calibration of complex infectious disease models using emulation: a tutorial and a case study on HIV in Uganda" (see full list of publications), and two more are in preparation.
Start Year 2012
Description Bayesian Uncertainty Analysis in Systems Biology 
Organisation Durham University
Country United Kingdom 
Sector Academic/University 
PI Contribution We developed Bayesian emulation techniques to analyse systems biology models of gene and metabolic networks, specifically for parameter searching and for the design of the most informative future experiments. This has solved two major problems in the systems biology area.
Collaborator Contribution Our partners provided several systems biology models to use in our analysis and experimental data gathered previously, and, most importantly, performed additional lab experiments on Arabidopsis plants as directed by our Bayesian design of experiments approach.
Impact Definitely multidisciplinary as it involves Bayesian statisticians in the department of mathematical sciences (Vernon, Goldstein), and biologists (Liu, Lindsey). Two major research problems have been solved. We obtained seed corn funding for this project which allowed us to present a 3 hour workshop on our techniques at the major annual Systems Biology conference (ICSB 2012, Toronto). Three papers on this topic are in various stages of preparation. We will soon prepare a substantial BBSRC proposal.
Start Year 2011
Description Bayesian Uncertainty in Galaxy Formation
Organisation Durham University
Country United Kingdom 
Sector Academic/University 
PI Contribution We developed Bayesian Computer Model Uncertainty emulation techniques for application to galaxy formation simulations, for use in parameter search problems. Our techniques solved a major problem in this area and have led to several publications, and a major research prize.
Collaborator Contribution Our partners in the Institute for Computational Cosmology provided several complex galaxy formation simulation models for us to use, and provided a large amount of computational resources for our use, along with much contextual expertise.
Impact Definitely multi-disciplinary: involving Bayesian statisticians (Vernon, Goldstein) and several cosmologists (Carlos Frenk, Richard Bower, Shaun Cole, Carlton Baugh, Cedric Lacey, Andrew Benson). It has led to 5 publications (in Bayesian Analysis, MNRAS, Statistical Science), and most importantly to the award of the top worldwide prize for Bayesian statistics: the Mitchell Prize for an invited discussion paper that I was first author on, "Galaxy Formation: a Bayesian Uncertainty Analysis". In addition we wrote an invited review article in Statistical Science, a top stats journal.
Start Year 2006