Symmetries and Invariances in Deep and Probabilistic Learning

Lead Research Organisation: University of Oxford

Abstract

Many problems in the real world exhibit a natural set of symmetries or invariances. Examples include a rotated image still being the same image, an object viewed from different angles being the same object, and the natural equivariance principles that pervade modern physics. It is natural, then, to expect that the models we build and learn about the world in Machine Learning and Statistics should follow these symmetries.
Recent work in Deep Learning has produced models with huge amounts of flexibility, along with methods to fit them to data. This flexibility, however, comes with a number of downsides. In particular, when such models are applied to problems with symmetries, they do not, in general, learn to obey these symmetries. Recent work has shown that building models that obey these symmetries has huge benefits. These models can achieve better accuracy on the tasks they are set, require fewer parameters to do so, and need significantly less data to train on. If the process of training these models or acquiring the data needed is either expensive or environmentally costly, the benefit of using these models is clear. An additional benefit of building in these symmetries is the guarantee that the model will behave the same way when it is presented with data that has been transformed in some way. This can have considerable implications for the safety guarantees of these models. A particular area of interest for the application of these methods is Machine Learning in the Natural Sciences, where they can help accelerate the scientific process.
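As a small illustration of the guarantee described above, the following Python sketch makes an arbitrary function invariant to 90-degree image rotations by averaging its output over all four rotations. It is a sketch under stated assumptions rather than a method taken from this project: the names c4_orbit, symmetrise and unconstrained_model are hypothetical, and group averaging is only one of several ways to build a symmetry into a model.

```python
import numpy as np

def c4_orbit(image):
    """Return the four 90-degree rotations of a square image."""
    return [np.rot90(image, k) for k in range(4)]

def symmetrise(f):
    """Wrap f so its output is averaged over all four rotations.

    The wrapped function is rotation-invariant by construction,
    because rotating the input only permutes the terms being averaged.
    """
    def f_invariant(image):
        return np.mean([f(g) for g in c4_orbit(image)], axis=0)
    return f_invariant

def unconstrained_model(image):
    """Hypothetical stand-in for a flexible model with no symmetry built in."""
    weights = np.arange(image.size, dtype=float).reshape(image.shape)
    return float(np.sum(weights * image))

rng = np.random.default_rng(0)
image = rng.normal(size=(4, 4))
rotated = np.rot90(image)

invariant_model = symmetrise(unconstrained_model)

# The unconstrained model generally changes its output under rotation;
# the symmetrised model does not (up to floating-point error).
print(unconstrained_model(image), unconstrained_model(rotated))
print(invariant_model(image), invariant_model(rotated))
```

The invariance here holds for every input, not only for inputs seen during training, which is the sense in which a built-in symmetry gives a guarantee rather than a learned tendency.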
The first aim of this project is to continue the work in the literature on symmetries in Deep Learning models. While substantial work has been done so far, significantly more remains to be done. The project also aims to look at symmetries in Probabilistic Modelling. Work in this area is much less mature, but it can benefit from considering symmetries in much the same way that Deep Learning has. A final aim of the project is to look at the process of discovering symmetries in data. Most of the work done so far has focused on taking a particular known symmetry of a data set and constructing a model that follows it. In many scenarios, however, we may not know the symmetries in the data a priori. In these situations we would want to be able to interrogate the data in a suitable way to discover what symmetries, or partial symmetries, it displays.
This project falls within the EPSRC Engineering research area.
This project involves partial collaboration with Qualcomm AI Research.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. It will be relevant research, addressing the key questions behind real world problems. The research will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximize reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/S023151/1                                      01/04/2019   30/09/2027
2247701             Studentship    EP/S023151/1   01/10/2019   30/09/2023   Michael Hutchinson