Fast Updating of Bayesian Models

Lead Research Organisation: Imperial College London
Department Name: Mathematics

Abstract

Bayesian inference provides a powerful tool for modelling complex data and quantifying uncertainty, and has been used to solve problems in many areas, including epidemiology, climatology and signal processing, to name a few. In applications of Bayesian inference, a common requirement is the ability to update Bayesian models quickly. This is especially desirable when, for example, building models iteratively to reflect a change in prior beliefs, to incorporate new observations into the model, or to correct the model after rectification of old data. A concrete example is epidemiological modelling of COVID-19 transmission, where a Bayesian model is built to predict daily coronavirus cases, and new posterior fits are required every day as new data are collected and fed into the model. Another example is model development, where practitioners often wish to adjust elements of the model to account for changes in their prior beliefs.

In this setting, which we term iterative Bayesian modelling, practitioners often have a posterior sample from the fit of the previous model at their disposal, and wish to obtain a new sample for the updated model. The traditional approach to iterative Bayesian modelling is to run a Bayesian inference method from scratch each time the model is updated. Although this can yield state-of-the-art approximation accuracy, it is undesirable because these Bayesian inference methods are often computationally expensive, and the computation spent fitting the old models is wasted. Existing literature has provided theoretical results on when re-using old fits can help the fitting of a new model, but this remains a fragmented area of research.
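As an illustration of re-using an old fit, one well-known option is importance reweighting: draws from the previous posterior are weighted by the likelihood of the new observations, so that weighted averages approximate expectations under the updated posterior. The sketch below is not taken from the project. The toy Normal model, the helper names `reweight_posterior` and `conjugate_posterior`, and all numerical settings are assumptions chosen so that the result can be checked against a closed-form answer.

```python
import numpy as np

def reweight_posterior(old_samples, log_lik_new):
    """Self-normalised importance weights for draws from the old posterior.

    old_samples: draws from p(theta | old data)
    log_lik_new: maps an array of theta values to log p(new data | theta)
    """
    log_w = log_lik_new(old_samples)
    log_w -= log_w.max()          # stabilise before exponentiating
    w = np.exp(log_w)
    w /= w.sum()
    ess = 1.0 / np.sum(w ** 2)    # effective sample size diagnostic
    return w, ess

rng = np.random.default_rng(0)

# Assumed toy model: y ~ Normal(theta, 1), prior theta ~ Normal(0, 10^2).
prior_var = 100.0
y_old = rng.normal(1.5, 1.0, size=50)
y_new = rng.normal(1.5, 1.0, size=5)

def conjugate_posterior(y):
    """Exact posterior mean and variance for the toy model (prior mean 0)."""
    post_var = 1.0 / (1.0 / prior_var + len(y))
    post_mean = post_var * y.sum()
    return post_mean, post_var

# Stand-in for an existing posterior sample from the old fit.
m_old, v_old = conjugate_posterior(y_old)
old_samples = rng.normal(m_old, np.sqrt(v_old), size=20000)

def log_lik_new(theta):
    # Log-likelihood of the new observations, up to an additive constant.
    return -0.5 * ((y_new[None, :] - theta[:, None]) ** 2).sum(axis=1)

w, ess = reweight_posterior(old_samples, log_lik_new)
is_mean = np.sum(w * old_samples)   # reweighted estimate of the new posterior mean
exact_mean, _ = conjugate_posterior(np.concatenate([y_old, y_new]))
```

Here `is_mean` can be compared with `exact_mean`, and `ess` indicates how many of the old draws remain effectively useful. When the update changes the posterior substantially, the effective sample size collapses and a fresh fit may be the better choice, which is the kind of trade-off the theoretical results mentioned above address.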

In this PhD project, we aim to explore different methods that can make use of the fit to the old model to accelerate the fitting of the updated model. Some objectives of this project are:
1. Designing algorithms that are able to update Bayesian models quickly under assumptions on the form of the changes to the model.
2. Reviewing and extending existing Bayesian inference methods to allow re-use of previous fits, and making comprehensive comparisons on their advantages and limitations.
3. Unifying the existing literature on theoretical results about when re-using previous fits can be more beneficial than running a Bayesian method from scratch, thus providing theoretical justification for the use of these algorithms in iterative Bayesian modelling.

This project falls within the EPSRC research area of Mathematical Sciences. If successful, it can inform the development of software that allows rapid model updating or model development, thereby benefiting the wide community of practitioners of Bayesian statistics.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high-calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting-edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. This will be relevant research, addressing the key questions behind real-world problems. The research will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximize reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023151/1                                   01/04/2019  30/09/2027
2605897            Studentship   EP/S023151/1  03/10/2020  30/09/2024  Xing Liu