Towards Verification of Bayesian Inference on Probabilistic Programs

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

Context: This research project is concerned with probabilistic programming. Probabilistic programs are a way to express statistical models as computer programs and to automate statistical inference methods on them. In this way, probabilistic programming simplifies Bayesian modeling for applied statisticians and other scientists. Therefore, it has had substantial impact on statistical modeling already, with tools like "Stan" (a probabilistic programming system) being used by statisticians, for example modeling the spread of Covid. There are also applications to machine learning as the Bayesian approach makes it easier to quantify uncertainty.

Aims: This particular project aims to verify the inference results obtained by existing statistical inference algorithms. This is desirable because there are classes of models where existing methods perform poorly and this is sometimes difficult to detect in practice. The potential impact of this project is to help find bugs in existing inference methods and to help develop new ones.

Novelty of the methodology: This project seeks to achieve these objectives with a new class of inference methods that provide guaranteed bounds on the inference result. As such, they occupy a middle ground between exact inference methods (which always give the correct result but are rarely applicable) and approximate methods (which can always be applied but may take a long time to converge to the correct result for some models). Such guaranteed bounds are obtained by means of "abstract interpretation", a well-known concept in program verification. This project is exploring a number of instances of this technique: "interval traces", "probability generating functions", and others. This is augmented with optimizations exploiting linear structure, which is common in statistical models and thus probabilistic programs. Results that have already been published suggest that such techniques can be superior to statistical validation methods and are able to handle some programming language features, such as recursion, better than existing methods. There is also the potential in combining these guaranteed bounds with approximate (randomized) inference methods. The bounds could be improved based on the sample density of approximate inference results. Conversely, guaranteed bounds could inform approximate inference methods like importance sampling by providing global information about the distribution of probability mass.

EPSRC research areas: This project falls within the EPSRC research areas of "Programming languages and compilers", "Verification and Correctness", and "Statistics and applied probability".

Collaborators: Part of this project was carried out with Raven Beutner from CISPA Helmholtz Center for Information Security in Germany. The project is supervised by Luke Ong.

Publications

10 25 50

Studentship Projects

Project Reference Relationship Related To Start End Student Name
EP/R513295/1 01/10/2018 30/09/2023
2285273 Studentship EP/R513295/1 01/10/2019 31/08/2023 Fabian Zaiser