Constructing summary statistics from composite likelihoods for approximate Bayesian computation

Lead Research Organisation: Newcastle University
Department Name: Sch of Maths, Statistics and Physics

Abstract

To generate realisations from a posterior distribution, methods such as Markov chain Monte Carlo and sequential Monte Carlo rely on evaluation of the likelihood function. However, for many complex models the likelihood is computationally intractable. Approximate Bayesian computation (ABC) and composite likelihoods can be useful tools to allow inference to proceed when the likelihood is not available.

When data can easily be simulated from a model, ABC provides an approximation to the posterior distribution, avoiding evaluation of the likelihood, by measuring the similarity between the observed data and simulated data. A basic ABC rejection scheme simulates data using parameter values sampled from the prior distribution, and accepts the parameter values if the distance between vectors of summary statistics for the observed data and simulated data is smaller than some pre-defined tolerance. The choice of summary statistics used within ABC can greatly affect the quality of the approximate inference.
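The basic rejection scheme described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the toy model (normal data with unknown mean and known unit variance), the uniform prior, the sample mean as summary statistic, the Euclidean distance, and all function and parameter names are assumptions chosen for the example.

```python
import numpy as np

def abc_rejection(observed, simulate, summarise, sample_prior, tolerance, n_draws, rng):
    """Basic ABC rejection: keep prior draws whose simulated summaries fall near the observed ones."""
    s_obs = np.atleast_1d(summarise(observed))
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)                      # draw a parameter value from the prior
        s_sim = np.atleast_1d(summarise(simulate(theta, rng)))
        if np.linalg.norm(s_sim - s_obs) < tolerance:  # Euclidean distance between summaries
            accepted.append(theta)
    return np.array(accepted)

# Toy setting (assumed for illustration): normal data with unknown mean, unit variance.
rng = np.random.default_rng(0)
observed = rng.normal(loc=2.0, scale=1.0, size=50)

posterior_draws = abc_rejection(
    observed,
    simulate=lambda theta, rng: rng.normal(loc=theta, scale=1.0, size=50),
    summarise=np.mean,
    sample_prior=lambda rng: rng.uniform(-5.0, 5.0),
    tolerance=0.1,
    n_draws=5000,
    rng=rng,
)
```

In this sketch a smaller tolerance gives a closer approximation to the posterior at the cost of fewer accepted draws, and replacing `summarise` changes which features of the data the comparison is sensitive to.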

If the full likelihood is unavailable, but evaluation of the likelihood for some subsets of the data is straightforward, then a composite likelihood can be used to replace the full likelihood. A composite likelihood is a weighted product of valid likelihood terms, corresponding to a collection of marginal or conditional events. The weights can be set equal, in which case they can be ignored, but a carefully selected set of weights can improve statistical efficiency. If a composite likelihood is used directly in Bayes' theorem, the variability in the posterior is often greatly underestimated, with the composite posterior being excessively concentrated. Calibration methods have been proposed to adjust the composite likelihood for use in Bayes' theorem. However, such methods can result in the calibrated composite posterior being too dispersed.
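In symbols, the weighted product described above can be written as follows. The notation is assumed here for illustration and does not come from the project record: y is the data, \theta the parameter, L_k the likelihood of the k-th marginal or conditional event, and w_k its non-negative weight.

```latex
% Composite likelihood as a weighted product of K component likelihoods,
% and the corresponding composite log-likelihood.
L_C(\theta; y) = \prod_{k=1}^{K} L_k(\theta; y)^{w_k},
\qquad
\log L_C(\theta; y) = \sum_{k=1}^{K} w_k \log L_k(\theta; y).
```

A common special case is the pairwise likelihood, in which each L_k is the bivariate density of one pair of observations.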

Previous work combining composite likelihoods and ABC suggests using the composite score as the summary statistic for ABC. To implement this approach, a suitable composite likelihood, including the associated weights, must first be chosen. For some models there may not be a clear choice for the weights, or there may be several candidate composite likelihoods to choose from. The aims of this project are: to explore new methods for constructing summary statistics for ABC from composite likelihoods; to develop methods for constructing summary statistics such that the associated weights in the composite likelihood can be chosen automatically; to develop techniques that combine multiple composite likelihoods so that each contributes to the summary statistic; and, ultimately, to develop a thorough methodology for the automatic construction of a summary statistic for use within ABC, giving efficient and accurate inference for models for which at least one composite likelihood can be defined. Possible applications for this new methodology include spatial extremes and space-time models.
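For reference, the composite score mentioned above is the gradient of the composite log-likelihood with respect to the parameter. The block below is a sketch in assumed notation (L_k for the k-th likelihood term, w_k for its weight, \epsilon for the ABC tolerance). Evaluating the score at a fixed reference value \hat{\theta}, for example a maximum composite likelihood estimate computed from the observed data, is an assumption made for this illustration rather than a detail given in the record.

```latex
% Composite score: weighted sum of the scores of the component likelihoods.
u_C(\theta; y) = \sum_{k=1}^{K} w_k \, \nabla_\theta \log L_k(\theta; y)

% Used as an ABC summary statistic, evaluated at a fixed reference value \hat{\theta}:
s(y) = u_C(\hat{\theta}; y),
\qquad
\text{accept } \theta \text{ if } \lVert s(y_{\mathrm{sim}}) - s(y_{\mathrm{obs}}) \rVert < \epsilon .
```

Because the weights w_k enter the summary statistic directly, a different choice of weights gives a different statistic, which is why the choice matters for this approach.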

Publications


Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/R51309X/1      |              |              | 01/10/2018 | 30/09/2023 |
2123488           | Studentship  | EP/R51309X/1 | 01/10/2018 | 08/08/2022 | Rosabeth White
 
Description: The work funded by this award has developed new methodology within the area of likelihood-free, simulation-based inference. The methodology developed (stacked composite score approximate Bayesian computation) is a new technique for constructing summary statistics for use in approximate Bayesian computation, providing inference for the parameters of complex models.

The work has developed theoretical support for the new methodology. The theory identifies the construction of composite-score-based summary statistics that is optimal in terms of asymptotic efficiency.

The work has developed guidelines for the practical implementation of the new methodology, including a summary statistic dimension reduction technique, and an adaptive algorithm.

Simulation experiments have identified potential applications to spatial models, time series models, and population genetic models. The experiments have demonstrated an improved quality of approximate inference over existing Bayesian composite likelihood methods.
Exploitation Route: The new methodology developed may be used by statisticians to draw insights from data. The work has identified a number of possible areas of application including inference in financial time series models, and inference of the recombination rate in population genetics models.

There is potential for further academic research to extend the methodology for a wider range of applications.
Sectors: Other

 
Title: Stacked composite score approximate Bayesian computation
Description: This new algorithm contains a novel approach to constructing summary statistics for ABC. The proposed approach uses the stacked composite score as the summary statistic. Asymptotic theory suggests that this is the optimal construction of a composite-score-based ABC summary statistic. For finite-sample applications, the algorithm can be applied with a lower-dimensional summary statistic, constructed by grouping together similar terms within the stacked composite score. For example, in the analysis of time series data with the pairwise likelihood, terms can be grouped according to the time lag between pairs of observations.
Type Of Material: Computer model/algorithm
Year Produced: 2021
Provided To Others?: No
Impact: No notable impacts.
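The lag-grouping idea in the algorithm description can be illustrated with a small sketch. It is one possible reading of the description, not the project's algorithm: the model (a stationary Gaussian AR(1) process with unit marginal variance, so that observations k steps apart have correlation rho**k), the use of a central finite difference for the score, and all function names are assumptions made for this example. Each component of the returned vector is the score of the pairwise log-likelihood summed over all pairs at one time lag, so the lags are kept as separate entries rather than combined into a single weighted sum.

```python
import numpy as np

def pair_loglik(x, y, r):
    """Log density of a standard bivariate normal pair with correlation r."""
    q = x**2 - 2.0 * r * x * y + y**2
    return -np.log(2.0 * np.pi) - 0.5 * np.log1p(-r**2) - q / (2.0 * (1.0 - r**2))

def lag_grouped_score(series, rho, max_lag, eps=1e-5):
    """Return one score component per lag, stacked into a vector.

    Component k is the derivative with respect to rho of the pairwise
    log-likelihood summed over all pairs that are k time steps apart,
    approximated by a central finite difference.
    """
    series = np.asarray(series, dtype=float)
    components = []
    for lag in range(1, max_lag + 1):
        x, y = series[:-lag], series[lag:]
        upper = pair_loglik(x, y, (rho + eps) ** lag).sum()
        lower = pair_loglik(x, y, (rho - eps) ** lag).sum()
        components.append((upper - lower) / (2.0 * eps))
    return np.array(components)

def simulate_ar1(rho, n, rng):
    """Simulate a stationary Gaussian AR(1) series with unit marginal variance."""
    y = np.empty(n)
    y[0] = rng.normal()
    innovations = rng.normal(scale=np.sqrt(1.0 - rho**2), size=n)
    for t in range(1, n):
        y[t] = rho * y[t - 1] + innovations[t]
    return y

rng = np.random.default_rng(1)
series = simulate_ar1(rho=0.6, n=2000, rng=rng)
summary = lag_grouped_score(series, rho=0.5, max_lag=3)  # vector of length 3
```

Under these assumptions the summary has `max_lag` components regardless of the series length, which is the kind of dimension reduction the description refers to. A vector like `summary` could then be compared between observed and simulated series inside an ABC scheme.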