Variational Representation Learning

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

This project falls within the EPSRC Artificial Intelligence Technologies research area. Representation learning is defined as learning to map data to a different representation with desirable properties, such as lower dimensionality, disentanglement, and support for arithmetic over meanings. In turn, these representations could enable generative modelling and semi-supervised or unsupervised learning.

In practice, data is often presented in an inconvenient form, such as a graph, a high-resolution MRI scan, or a piece of text. To process this data efficiently and correctly, it would be preferable to work with a different representation. This research project aims to convert the original data representation into a more convenient one while preserving meaningful properties of the data. This representation may be a lower-dimensional form of the same data type, or have other desirable properties, such as the removal of personally identifiable information.

There are currently several methods for obtaining these more convenient representations, for example latent variable models [1] and metric learning [2]. Each of these methods has downsides, such as providing no guarantees on the information contained in the representation or optimising only a surrogate loss. In this project we aim to introduce novel, easy-to-use methods for creating representations with guarantees on the information contained in the new representation. Such representations can be used with confidence in other applications, such as medical imaging, language translation, and compression. We plan to publish these methods, for various types of data and in various settings, at top-tier machine learning conferences. We aim to develop algorithms and methods that improve upon current solutions by optimising the true objective and by allowing the quality of the newly found representation to be quantified. This will be done by leveraging and developing variational algorithms that allow us to learn new types of models, likely in combination with deep neural networks.

The project is tightly aligned with the research area Artificial Intelligence Technologies (AIT). Representation learning is currently used in a wide variety of AI technologies, such as medical imaging, computer vision, and reinforcement learning. Improvements in learning representations will therefore enable better autonomous technologies, improve our ability to make fair and correct medical diagnoses, and enable privacy-minded applications built on top of data gathered by Internet of Things devices.
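
As a concrete illustration of the variational algorithms mentioned above, the latent variable model of [1] learns a representation by maximising the evidence lower bound (ELBO) on the data log-likelihood; in the standard notation of that paper,

\log p_\theta(x) \;\geq\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big),

where q_\phi(z \mid x) is the learned approximate posterior over the latent code z (the new representation of a data point x), p_\theta(x \mid z) is the decoder, and p(z) is a fixed prior. The first term rewards reconstructing the data from the latent code, while the KL term keeps the code close to the prior, regularising the learned representation.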

The project will be supervised by Professor Yarin Gal from the Department of Computer Science and co-supervised by Professor Yee Whye Teh from the Department of Statistics, both at the University of Oxford. Aside from the EPSRC grant, the project will be funded by the Oxford-Google DeepMind fellowship.
[1] Kingma, Diederik P., and Max Welling. "Auto-Encoding Variational Bayes." ICLR, 2014.
[2] Hoffer, Elad, and Nir Ailon. "Deep Metric Learning Using Triplet Network." International Workshop on Similarity-Based Pattern Recognition. Springer, Cham, 2015.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/N509711/1      |              |              | 01/10/2016 | 30/09/2021 |
2056659           | Studentship  | EP/N509711/1 | 01/10/2018 | 30/09/2022 | Joost Van Amersfoort
 
Description Machine learning is being used in an increasing number of situations to make automated decisions. Examples of these situations are medical diagnostics, self-driving cars, and fraud detection. In order to trust a system that uses machine learning to make decisions in an automated fashion, the system needs to be able to signal when it is uncertain about what to do, for example when a self-driving car encounters a new road situation, or when an X-ray image shows a very rare disease. In the work funded through this award, we analysed deep learning systems (a branch of machine learning with promising results) to see whether they are able to signal their uncertainty in a reliable way.

We found that, by default, machine learning models are not able to quantify their uncertainty, which hampers their adoption in practice. As one of the achievements of this work, we defined the properties a deep neural network should have in order to express its uncertainty. We then devised a way to implement these properties and obtained a model that outperformed the standard setup, while also opening up several promising future directions. Using this alternative model will enable more robust and trustworthy deployment of machine learning models.
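
The specific model developed in this work is not reproduced here; as a generic illustration of how a classifier's uncertainty can be turned into a signal for deferring a decision, the sketch below flags inputs whose softmax predictive entropy exceeds a threshold. The network, the threshold value, and the helper names are hypothetical placeholders.

# Illustrative sketch only (not the model developed in this project):
# flag inputs on which a classifier is too uncertain to act automatically.
import torch
import torch.nn.functional as F

def predictive_entropy(logits: torch.Tensor) -> torch.Tensor:
    # Entropy of the softmax predictive distribution, one value per input.
    probs = F.softmax(logits, dim=-1)
    return -(probs * torch.log(probs + 1e-12)).sum(dim=-1)

def flag_uncertain(model: torch.nn.Module, x: torch.Tensor, threshold: float) -> torch.Tensor:
    # Boolean mask marking inputs that should be deferred to a human expert.
    with torch.no_grad():
        logits = model(x)
    return predictive_entropy(logits) > threshold

# Toy usage with a hypothetical linear classifier and random inputs.
model = torch.nn.Sequential(torch.nn.Linear(16, 10))
x = torch.randn(8, 16)
print(flag_uncertain(model, x, threshold=0.5))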

Additionally, we looked at improving model performance by selectively labelling only the most informative data points. Generally, obtaining a labelled data set is the most expensive part of the machine learning pipeline. By labelling only the most informative data points, it becomes significantly cheaper to obtain a well-performing model. There is significant overlap with the uncertainty quantification discussed in the previous paragraph: the data points on which the model is most uncertain are generally also the data points from which the model would learn the most if they were labelled. In our work, we improve the ability to select a batch of informative data points; a simple sketch of the general idea is given below. Labelling a batch of points at once is generally preferable when, for example, an appointment with an expert needs to be made for every round of labelling.
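
As a minimal, generic sketch of uncertainty-based batch selection (not the specific acquisition method developed in this work), the snippet below ranks unlabelled points by the entropy of the model's predictions and picks the most uncertain ones. The array shapes, the batch size, and the placeholder predictions are assumptions made purely for illustration.

import numpy as np

def entropy(probs: np.ndarray) -> np.ndarray:
    # Entropy of each row of a (num_points, num_classes) probability array.
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def select_batch(probs: np.ndarray, batch_size: int) -> np.ndarray:
    # Indices of the batch_size most uncertain unlabelled points.
    scores = entropy(probs)
    return np.argsort(-scores)[:batch_size]

# Toy usage: choose 4 points to send to an expert out of 100 unlabelled ones.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(10), size=100)  # placeholder model predictions
print(select_batch(probs, batch_size=4))
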
Exploitation Route 1. For any situation that can be automated, such as fault detection in engineering, it is possible to use our work to obtain a well-performing model with significantly fewer labelled data points than before. Future work on the academic side includes scaling up further to larger batches and picking even more informative points. On the non-academic side, it is now reasonable to consider using machine learning even when only a small budget is available. This lowers the barrier to entry and enables companies to work more efficiently in the future.

2. When a model is deployed for a particular task, it is possible to use our model to detect problems while the deployed model is being used. An alert could be generated when there is a significant amount of noise coming from a sensor due to rain or snow, or when a novel situation arises that was not covered by the original data set. Such alerts will increase a company's trust in the model's ability to avoid mistakes. Academically, the current approach can still be improved in a number of ways, such as speed, ease of use, and reliability of the uncertainty estimates.
Sectors Digital/Communication/Information Technologies (including Software), Healthcare, Pharmaceuticals and Medical Biotechnology