Inductive bias selection in Bayesian models

Lead Research Organisation: University of Oxford

Abstract

This project falls within the EPSRC Mathematical Sciences research area.

Over the past few years, machine learning models, with neural networks at the forefront, have achieved state-of-the-art performance on a variety of tasks. Among the factors that have contributed to such breakthroughs are significant architectural innovations. Depending on the data one wants to model and the task one wants to solve, highly specialised models have been developed. For instance, convolutional neural networks (CNNs), recurrent neural networks (RNNs), and graph neural networks (GNNs) are especially suited for image, sequential, and graph data, respectively. Compared to more general models, such as multi-layer perceptrons (MLPs), these specialised models are much more effective at modelling the data they are designed for, despite not necessarily possessing a higher capacity. In other words, while all these models might have the ability to fit the training data equally well, they differ enormously in how they generalise to unseen data. Such a difference in generalisation performance is due to the different sets of assumptions about how the data points relate to one another, called inductive biases, implicitly encoded in the models' architectures. For example, the way in which the architecture of CNNs is organised encourages them to be approximately translation invariant: images containing the same pattern in different positions will tend to produce the same output.
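The translation-invariance example can be made concrete with a minimal sketch. It is not taken from the project itself: the function names, the toy signal, and the use of circular (wrap-around) boundaries are assumptions chosen so that the property holds exactly in one dimension. The convolution step on its own is translation equivariant (shifting the input shifts the feature map), and the global pooling step then makes the final output invariant.

```python
# Illustrative sketch only: a tiny 1D "CNN-like" feature extractor in pure Python.
# All names and values here are hypothetical and chosen for the demonstration.

def circular_conv(signal, kernel):
    """Cross-correlate `signal` with `kernel`, wrapping around at the edges."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def global_max_pool(features):
    """Collapse a feature map to a single number, discarding position."""
    return max(features)

def shift(signal, k):
    """Cyclically shift `signal` to the right by `k` positions."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)

signal = [0, 0, 1, 3, 1, 0, 0, 0]   # a small "pattern" at positions 2..4
kernel = [1, 2, 1]                  # the same weights are reused at every position
shifted = shift(signal, 3)          # the same pattern, moved 3 steps to the right

features = circular_conv(signal, kernel)
features_shifted = circular_conv(shifted, kernel)

# Equivariance: the feature map moves with the input.
print(features_shifted == shift(features, 3))
# Invariance: after global pooling, the output no longer depends on position.
print(global_max_pool(features) == global_max_pool(features_shifted))
```

A fully connected layer applied to the same inputs has no such weight sharing, so nothing in its structure ties the output for `signal` to the output for `shifted`; it would have to learn that relationship from data.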

A downside of highly specialised models with powerful inductive biases is that they currently require human supervision and domain knowledge to design. In particular, candidate models are usually evaluated with cross-validation until a satisfactory solution is found. As the space of candidate architectures is usually huge, such a process often becomes extremely expensive and time-consuming. In contrast, a recent research direction, pursued by Prof. Van der Wilk and his group, uses ideas from Bayesian model selection to design training objectives that are amenable to gradient-based optimisation, so that the model's architecture and its parameters can be learned simultaneously. This approach has the potential to significantly streamline the development of task-specific architectures by automating the model design pipeline.
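As a rough illustration of the Bayesian model selection idea, the sketch below scores two candidate inductive biases by their log marginal likelihood under Gaussian process regression, using only the training data and no cross-validation. Everything in it is an assumption made for the example: the toy periodic data, the two kernels, and the fixed hyperparameters (length scales, period, noise variance). It compares two fixed candidates rather than optimising a differentiable objective, so it is a simplified stand-in for the gradient-based approach described above, not the group's method.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def log_marginal_likelihood(xs, ys, kernel, noise_var=0.01):
    """log p(y | X, kernel) for zero-mean GP regression with Gaussian noise."""
    n = len(xs)
    gram = [
        [kernel(xs[i], xs[j]) + (noise_var if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    l = cholesky(gram)
    # Forward substitution: solve L z = y, so that z.z = y^T K^{-1} y.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(l[i][j] * z[j] for j in range(i))) / l[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(l[i][i]) for i in range(n))
    return -0.5 * (quad + log_det + n * math.log(2.0 * math.pi))

def rbf_kernel(a, b, length_scale=0.3):
    """Generic smoothness assumption: nearby inputs have similar outputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def periodic_kernel(a, b, period=1.0, length_scale=1.0):
    """Periodicity assumption: inputs one period apart have the same output."""
    return math.exp(-2.0 * math.sin(math.pi * (a - b) / period) ** 2 / length_scale ** 2)

# Toy data (hypothetical): five periods of a sine wave, four samples per period.
xs = [0.25 * i for i in range(20)]
ys = [math.sin(2.0 * math.pi * x) for x in xs]

lml_rbf = log_marginal_likelihood(xs, ys, rbf_kernel)
lml_periodic = log_marginal_likelihood(xs, ys, periodic_kernel)

print("RBF kernel:     ", lml_rbf)
print("Periodic kernel:", lml_periodic)
print("Preferred:", "periodic" if lml_periodic > lml_rbf else "RBF")
```

On this data the periodic kernel receives the higher score, because its assumption matches the structure of the observations. In the setting described above, the discrete choice between kernels would be replaced by continuous parameters controlling the inductive bias, so that the same kind of objective can be maximised with gradients.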

During my PhD, I will further investigate automatic inductive bias selection in Bayesian models. In this context, I will study Gaussian processes and Bayesian neural networks, as they represent flexible models that can be easily made to incorporate a wide array of inductive biases. The project will involve designing model parameterisations and training objectives suitable for this task. At the same time, I will strive to combine automatic inductive bias selection with other desirable features of Bayesian modelling, such as uncertainty quantification. To start, I will explore the inductive biases useful for modelling dynamical systems, with the aim of developing robust and scalable Bayesian models with potential applications ranging from the natural sciences to engineering and finance.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high-calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting-edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships and placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. This research will be relevant, addressing the key questions behind real-world problems. The research will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximise reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Publications


Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S023151/1      |              |              | 01/04/2019 | 30/09/2027 |
2740634           | Studentship  | EP/S023151/1 | 01/10/2022 | 30/09/2026 | Stefano Cortinovis