Variational Representation Learning

Lead Research Organisation: University of Oxford

Department Name: Computer Science

Abstract

This project falls within the EPSRC Artificial Intelligence Technologies research area. Representation learning is defined as learning to map data to a different representation that has some preferable properties. These properties include lower dimensionality, disentanglement, and meaningful arithmetic. In turn, these representations could enable generative modelling and semi-supervised or unsupervised learning.

In practice, data is often presented in an inconvenient form, such as a graph, a high resolution MRI scan or a piece of text. To be able to process this data efficiently and correctly, it would be preferable to work with a different representation. This research project is aimed at converting the original data representation into a more convenient one, while preserving meaningful properties of the data. This representation may be a lower dimensional form of the same data type, or have some preferable properties such as removal of personally identifiable information.

There are currently several methods to obtain these more convenient representations, for example latent variable models [1] and metric learning [2]. Each of these methods has some downsides, such as offering no guarantees on the information contained in the representation, or only optimizing a surrogate loss. In this project we aim to introduce novel, easy-to-use methods for creating representations with guarantees on the information contained in the new representation. These representations can be used with confidence in other applications, such as medical imaging, language translation and compression. We plan to publish these methods for various types of data in various settings at top-tier machine learning conferences. We aim to develop algorithms and methods that improve upon the current solutions by optimizing for the true objective and by making it possible to quantify the quality of the newly found representation. This will be done by leveraging and developing variational algorithms that allow us to learn new types of models, most likely using deep neural networks.

The project is tightly aligned with the research area Artificial Intelligence Technologies (AIT). Representation learning is currently in use by a wide variety of AIT, such as medical imaging, computer vision and reinforcement learning. Improvements in learning representations will therefore enable better autonomous technologies, improve our ability to make fair and correct medical diagnoses, and enable privacy-minded applications built on top of data gathered by Internet of Things devices.

The project will be supervised by Professor Yarin Gal from the Department of Computer Science and co-supervised by Professor Yee Whye Teh from the Department of Statistics, both at the University of Oxford. Aside from the EPSRC grant, the project will be funded by the Oxford-Google DeepMind fellowship.
[1] Kingma, Diederik P., and Max Welling. "Auto-encoding variational Bayes." ICLR, 2014.
[2] Hoffer, Elad, and Nir Ailon. "Deep metric learning using triplet network." International Workshop on Similarity-Based Pattern Recognition. Springer, Cham, 2015.

Student:

Joost Van Amersfoort

Period of Study:

Sep 18 - Sep 22

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2056659

Research Topic:

Unclassified

Organisations

University of Oxford (Lead Research Organisation)

People	ORCID iD
Joost Van Amersfoort (Student)

Publications

Alizadeh M (2022) Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients

Jesson A (2021) Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data

Kirsch A (2019) BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning

Van Amersfoort J (2020) Uncertainty Estimation Using a Single Deep Deterministic Neural Network

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/N509711/1			30/09/2016	29/09/2021
2056659	Studentship	EP/N509711/1	30/09/2018	29/09/2022	Joost Van Amersfoort

Key Findings


Description	Machine learning is being used in an increasing number of situations to make automated decisions. Examples include medical diagnostics, self-driving cars, and fraud detection. To trust a system that uses machine learning to make decisions in an automated fashion, the system needs to be able to signal when it is uncertain about what to do, for example when a self-driving car encounters a new road situation, or when a diagnostic system is shown the X-ray image of a very rare disease. In the work funded through this award, we analysed deep learning systems (a branch of machine learning with promising results) to see whether they are able to signal their uncertainty in a reliable way. We found that by default machine learning models are not able to quantify their uncertainty, which hampers their adoption in practice. In one of the achievements of our work, we defined the properties a deep neural network should have in order to be able to express its uncertainty. We then came up with a way to implement these properties and obtained a model that outperformed the standard setup, with several promising future directions opening up as well. Using this alternative model will enable more robust and trustworthy deployment of machine learning models.

Additionally, we looked at improving model performance by selectively labeling only the most interesting data points. Generally, obtaining a labelled data set is the most expensive part of the machine learning pipeline. By labeling only the most informative data points, it becomes significantly cheaper to obtain a well-performing model. There is significant overlap with the uncertainty quantification discussed in the previous paragraph: data points on which the model is most uncertain are generally also the data points that the model would learn the most from if they were labelled. In our work, we improve the ability to select a batch of informative data points. Labeling a batch of points at once is generally preferable, for example when an appointment needs to be made with an expert for every labeling session.
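The link between uncertainty and data selection described above can be illustrated with a minimal sketch. It assumes a classifier has already produced class probabilities for each unlabelled point, scores each point by its predictive entropy, and picks the highest-scoring points for labeling. This naive top-k rule is only a simple baseline chosen for illustration, not the batch selection method developed in this project: it scores points independently and so may pick several near-identical points. The function names and the example probabilities are hypothetical.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, batch_size):
    """Return indices of the `batch_size` pool points with the highest entropy.

    Naive top-k baseline: each point is scored on its own, so redundancy
    between the selected points is ignored.
    """
    scores = [predictive_entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


# Hypothetical predicted class probabilities for four unlabelled points.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # close to uniform, high entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

chosen = select_most_uncertain(pool_probs, batch_size=2)
print(chosen)
```

The points whose predicted distribution is closest to uniform are selected first, which matches the observation that the model tends to learn the most from the points it is least certain about.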
Exploitation Route	1. For any situation that can be automated, such as fault detection in engineering, it is possible to use our work to obtain a well performing model with significantly fewer labelled data points than before. Future work on the academic side includes further scaling up to larger batches and picking even more informative points. On the non-academic side, it is now reasonable to consider using machine learning even when only a small budget is available. This lowers the barrier to entry and enables companies to work more efficiently in the future.
2. When a model is deployed for a particular task, it's possible to use our model to detect problems as the model is being used. An alert could be generated when there is a significant amount of noise coming from the sensor due to rain or snow, or a particular novel situation that was not incorporated in the original data set. This alert will increase the trust of the company in the model's ability to avoid mistakes. Academically, the current approach can still be improved in a number of ways such as speed, ease of use, and uncertainty reliability.
Sectors	Digital/Communication/Information Technologies (including Software); Healthcare; Pharmaceuticals and Medical Biotechnology
