Robust and scalable Markov chain Monte Carlo for heterogeneous models
Lead Research Organisation:
UNIVERSITY COLLEGE LONDON
Department Name: Statistical Science
Abstract
A large proportion of statistical inference tasks can be framed as either an optimisation or an integration problem. Markov chain Monte Carlo (MCMC) algorithms can be used to solve both, but are most commonly used for the latter. They have been successful in such important and diverse settings as the observation of gravitational waves, modelling the spread of infectious diseases, and predicting the results of elections from political polling data. MCMC algorithms are also popular outside of statistical inference; in particular, they are widely used for molecular dynamics simulations in statistical physics.
Despite their numerous successes, current MCMC algorithms have some known drawbacks. A prominent example is their performance when model parameters vary over very different scales and exhibit multiple levels of inter-dependence (heterogeneous models). There is an increasingly urgent need to improve this performance as high-volume and highly heterogeneous datasets become more widely available, and as researchers begin to ask progressively more nuanced questions of their data, for which heterogeneous models are needed.
The standard approach to MCMC for heterogeneous models is through adaptive pre-conditioning of algorithms. Doing this naively in a high-dimensional setting comes at a significant cost (the required number of operations per algorithm step is often cubic in the number of model parameters, and the number of algorithmic tuning parameters to learn is quadratic). In addition, current state-of-the-art algorithms such as Hamiltonian and Langevin Monte Carlo work particularly poorly in combination with the technique, as has recently been shown both theoretically and experimentally by myself and others. In this proposal I will attack this problem on two fronts.
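To make the scaling in the paragraph above concrete, the following is a minimal arithmetic sketch, not a description of any particular algorithm in the proposal. It assumes the dense pre-conditioner is a symmetric d x d matrix that is refactorised with a Cholesky decomposition (leading-order cost of roughly d^3/3 operations), and compares it with a diagonal pre-conditioner. The function name `preconditioner_cost` and its dictionary keys are hypothetical.

```python
def preconditioner_cost(d):
    """Rough tuning-parameter and operation counts for dimension d.

    Assumption: the dense pre-conditioner is a symmetric d x d matrix
    refactorised by a Cholesky decomposition at leading-order cost d^3 / 3.
    """
    return {
        # Free entries of a symmetric d x d matrix: quadratic in d.
        "dense_params": d * (d + 1) // 2,
        # One scale per coordinate: linear in d.
        "diag_params": d,
        # Leading-order Cholesky factorisation cost: cubic in d.
        "dense_factor_ops": d ** 3 // 3,
        # Rescaling each coordinate once: linear in d.
        "diag_scale_ops": d,
    }


if __name__ == "__main__":
    for dim in (10, 100, 1000):
        print(dim, preconditioner_cost(dim))
```

For d = 1000 the dense option already has about half a million tuning parameters to learn, which is the motivation for the scalable alternatives discussed in the second work package.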
In the first work package I will develop and study a new suite of MCMC algorithms that are specifically tailored to heterogeneous models. I will do this by designing algorithms based on the recently derived class of Markov processes termed 'locally-balanced', for which there is considerable evidence of improved robustness to model heterogeneity. I will provide a rigorous foundation for this class of Markov processes, establish key theoretical properties on convergence to equilibrium and optimality, and then design new algorithms based on this class of processes, each tailored towards specific application areas of known interest.
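One published example of a locally-balanced Metropolis-Hastings algorithm is the Barker proposal, which is named in the Publications list of this record. The sketch below is an illustrative, standard-library-only implementation of a single Barker step for a target with a differentiable log-density. It is not the interface of the rmcmc package; the function names (`softplus`, `barker_step`) and arguments are hypothetical, and the fixed `step_size` is an assumption (no adaptation is performed).

```python
import math
import random


def softplus(t):
    """Numerically stable log(1 + exp(t))."""
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def barker_step(x, log_density, grad_log_density, step_size, rng):
    """One Metropolis-Hastings step using the Barker proposal (sketch).

    x is a list of floats; log_density maps a list to a float and
    grad_log_density maps a list to a list of partial derivatives.
    Returns (new_state, accepted).
    """
    grad_x = grad_log_density(x)
    y = []
    for x_i, g_i in zip(x, grad_x):
        z = rng.gauss(0.0, step_size)
        # Move by +z with probability 1 / (1 + exp(-z * g_i)), else by -z,
        # so the gradient skews the direction but not the size of the move.
        p_plus = math.exp(-softplus(-z * g_i))
        y.append(x_i + z if rng.random() < p_plus else x_i - z)

    grad_y = grad_log_density(y)
    # Log acceptance ratio: target ratio times the proposal-density ratio.
    log_ratio = log_density(y) - log_density(x)
    for x_i, y_i, gx_i, gy_i in zip(x, y, grad_x, grad_y):
        log_ratio += softplus(-(y_i - x_i) * gx_i)
        log_ratio -= softplus(-(x_i - y_i) * gy_i)

    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return y, True
    return x, False


if __name__ == "__main__":
    # Toy heterogeneous target: independent Gaussians with scales 1 and 10.
    scales = [1.0, 10.0]
    log_pi = lambda v: -sum((v_i / s) ** 2 for v_i, s in zip(v, scales)) / 2.0
    grad_log_pi = lambda v: [-v_i / s ** 2 for v_i, s in zip(v, scales)]

    rng = random.Random(1)
    state, n_accept = [0.0, 0.0], 0
    for _ in range(2000):
        state, accepted = barker_step(state, log_pi, grad_log_pi, 1.0, rng)
        n_accept += accepted
    print("acceptance rate:", n_accept / 2000)
```

The toy target at the bottom has two coordinates with very different scales, which is the simplest version of the heterogeneity described above; a single shared step size is used deliberately to show the untuned setting.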
In the second work package I will develop new theoretically grounded methodology for scalable adaptive pre-conditioning of algorithms. I will do this in part by taking inspiration from the literature on sparse estimation of covariance matrices for high-dimensional datasets. I will design methods that are both scalable to high-dimensional settings and for which theoretical guarantees can be established, to provide a clear indication of expected performance gains. This should improve the applicability of existing state-of-the-art methods such as Hamiltonian Monte Carlo to the high-dimensional and heterogeneous model setting.
The proposal has a keen focus on integrating new methods within widely used statistical software. To this end, I have planned collaborations with the founding developers of the 'Stan' statistical programming language, which has over 100,000 users, as well as detailed plans to create bespoke open source packages in software such as R and Python. I also outline plans to work closely with data scientists to apply the new methodology in many prominent application areas.
Publications
Faulkner M
(2024)
Sampling Algorithms in Statistical Physics: A Guide for Statistics and Machine Learning
in Statistical Science
Liang X
(2022)
Adaptive random neighbourhood informed Markov chain Monte Carlo for high-dimensional Bayesian variable selection
in Statistics and Computing
Liang X
(2023)
Adaptive MCMC for Bayesian Variable Selection in Generalised Linear Models and Survival Models.
in Entropy (Basel, Switzerland)
Vogrinc J
(2023)
Optimal design of the Barker proposal and other locally balanced Metropolis-Hastings algorithms
in Biometrika
| Description | We have discovered a general new class of Markov processes (locally-balanced Markov processes) that can be used to design sampling algorithms on very general spaces. These algorithms can be used to fit complex statistical/machine learning models. We have designed numerous sampling algorithms based on these processes and applied them to a range of application domains. This is the key output of work package 1. For work package 2 we have identified clear situations in which a commonly used technique for improving sampling algorithms (diagonal preconditioning) is actually worse than doing nothing at all. We have also established conditions on the problem under which the preconditioning is guaranteed to improve sampling performance, and developed cheap and efficient non-diagonal preconditioners for sampling problems in which diagonal preconditioners are not sufficient. Finally, we have created a software package implementing the methods developed in the work packages, which is written in R and freely available on CRAN under an open source software license. |
| Exploitation Route | The software package has generated interest, with people keen to use it and to help develop it further. We have understood how the method can be extended in various ways and are actively doing this at present. I intend to apply for future funding in Summer 2025 to continue with these projects. |
| Sectors | Aerospace, Defence and Marine; Chemicals; Digital/Communication/Information Technologies (including Software); Energy; Financial Services and Management Consultancy; Healthcare; Manufacturing including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology |
| Description | Additional 3 months PDRA time |
| Amount | £30,000 (GBP) |
| Organisation | University College London |
| Sector | Academic/University |
| Country | United Kingdom |
| Start | 02/2024 |
| End | 04/2024 |
| Description | Software development |
| Organisation | University College London |
| Department | Advanced Computing Research Centre (ACRC) |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | I managed an ARC software developer to create an R package to implement the methods developed in the grant. |
| Collaborator Contribution | The software developer helped to develop the package in collaboration with myself. |
| Impact | - Extensions to the package are planned for the future/still being developed - A Journal of Open Source Software paper is currently in submission - The collaboration is multi-disciplinary between Statistics, Machine Learning and Computer Science |
| Start Year | 2024 |
| Description | Visit to Flatiron Institute New York April 2024 |
| Organisation | Simons Foundation |
| Department | Flatiron Institute |
| Country | United States |
| Sector | Academic/University |
| PI Contribution | Visited Bob Carpenter and others at Flatiron, discussed ideas and new collaborations, and presented work to the Computational statistics group at a seminar. |
| Collaborator Contribution | An in-kind contribution was made by a member of the team to the value of roughly $10k. |
| Impact | Presented work in a seminar and discussed future collaborations. The disciplines involved are Statistics and Computer Science. |
| Start Year | 2024 |
| Description | Visit to Flatiron Institute New York |
| Organisation | Simons Foundation |
| Department | Flatiron Institute |
| Country | United States |
| Sector | Academic/University |
| PI Contribution | I visited the institute for 2 weeks, gave a seminar, attended a workshop for 3 days, and discussed several research ideas with the staff. We are currently exploring projects related to these. |
| Collaborator Contribution | Sharing of research ideas, knowledge exchange. |
| Impact | No outputs yet, collaborations still ongoing. |
| Start Year | 2022 |
| Title | rmcmc R package |
| Description | rmcmc R package for adaptive and robust Markov chain Monte Carlo sampling. |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Open Source License? | Yes |
| Impact | It's early days but people have contacted me about it and are interested in developing it further. |
| URL | https://github.com/UCL/rmcmc |
