An evaluation of how probabilistic and differentiable programming languages ought to be written in order to preserve correctness and flexibility

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

Probabilistic and differentiable programming languages are programming languages that
can work together to perform Bayesian machine learning. Bayesian machine learning is an
increasingly popular, model-based approach to machine learning, that has a mathematical
foundation which allows it to deal well with uncertainty and explainability. To emphasise
the increasing importance of probabilistic programming, it is noteworthy that probabilistic
and differentiable programming languages have recently been developed by high profile
corporations, such as Uber (Pyro), Facebook (HackPPL) and Google (Edward, Jax).
As probabilistic and differentiable programming languages become increasingly used in
important decision-making systems (e.g. insurance), more scrutiny will be applied to the
models, algorithms, and hardware reliability, but also the code and programming
languages themselves. If implementations of probabilistic and differentiable programming
languages can be shown to be correct with respect to a mathematical specification, then
this will allow society to have increased confidence in the predictions given by systems
which rely on probabilistic and differentiable programming languages.
Programming language semantics tie computer code that is written to a mathematical
domain, allowing for the rigorous assessment of the denotation and relative consistency of
a programming language. The utility of this notion is demonstrated by considering a
nuclear power station. If one runs code in a nuclear reactor, one would like to be able to
mathematically prove properties about the code in critical components. Semantics also
justify the validity of the way in which programs can be transformed (whilst preserving the
underlying semantics), which is important not only for compilers and optimisers, but also in
guiding the programmer in the first place. To date, PPLs tend to be implemented without
explicit focus to a particular semantics. In addition, PPLs tend to be written in languages,
which whilst easy to prototype in, are not easy to assess the correctness of with respect to
a semantics (using compile time guarantees, for instance). This project aims to assess
how probabilistic and differentiable programming languages can be implemented in
functional programming languages, which allow for extensive compile time guarantees on
the correctness of code. This will be performed by implementing a probabilistic and
differentiable programming library in Haskell, with explicit reference to a semantics. The
focus will not only be on the correctness of the programming language, but also on the
flexibility, compositionality and modularity of the code involved. To do this, this project will
investigate contemporary techniques in pure functional programming languages.
One example of a cutting-edge technique in programming languages research are
algebraic effects and their handlers. They are a language construct that allow
programmers to manipulate the composition of language features in a modular and
sophisticated manner, by elegantly separating syntax and semantics. This project will
investigate algebraic effects and handlers in context of probabilistic and differentiable
programming.
Once, a clearer idea of how these languages should be written is established, then an
investigation into how this applies to the discovery of new drugs will occur using
probabilistic programming. Representing molecules and their simulation will be a focal
point for this project. This project is interdisciplinary by nature and will use ideas from
(quantum) chemistry and structural biology. This project falls within the EPSRC
Information and Communication Technologies (Verification and Correctness) research
area. No private enterprise is currently involved in this research

Publications

10 25 50

Studentship Projects

Project Reference Relationship Related To Start End Student Name
EP/T517811/1 01/10/2020 30/09/2025
2421783 Studentship EP/T517811/1 01/10/2020 30/09/2025