Self-supervised autoencoder framework for salient sensory feature extraction

Lead Research Organisation: Imperial College London
Department Name: Electrical and Electronic Engineering

Abstract

The natural world is full of noise, but the brain's capacity for information transmission is limited, so discarding irrelevant information from sensory inputs is key. Studies suggest that the brain may achieve this in part by implementing information bottlenecks. Implementing such a bottleneck requires a measure of information, and mutual information is the usual choice. However, mutual information and many existing approaches to estimating it suffer from the curse of dimensionality. Recently, this challenge has been addressed by framing mutual information estimation as a minimax optimisation problem in an adversarial setting, an approach that has been found to scale with both dimension and sample size.
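To make the quantity concrete, the following is a minimal sketch of mutual information computed directly from a known joint probability table. It is illustrative only and is not the estimator used in the project; the function name `mutual_information` and the two example tables are assumptions introduced here.

```python
import math

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table.

    joint[i][j] is p(X=i, Y=j); the entries are assumed to sum to 1.
    """
    px = [sum(row) for row in joint]          # marginal p(x)
    py = [sum(col) for col in zip(*joint)]    # marginal p(y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                     # 0 * log 0 is taken as 0
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# Hypothetical example tables for two binary variables.
independent = [[0.25, 0.25], [0.25, 0.25]]   # X tells us nothing about Y
copy = [[0.5, 0.0], [0.0, 0.5]]              # Y is an exact copy of X

mi_independent = mutual_information(independent)
mi_copy = mutual_information(copy)
```

A table-based calculation like this needs one cell per joint outcome, so for d binary input dimensions the table over the inputs alone has 2^d rows. That growth is one way to see why direct approaches break down in high dimensions.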

Building on this, we propose an adversarial-inspired autoencoder framework for salient sensory feature extraction consisting of three neural networks: an encoder, a decoder, and a classifier. The objective is for the encoder to learn the salient features that are necessary for classification but not for reconstruction. The auxiliary classification task helps condition the encoder's latent space to capture these salient features. Preliminary results on MNIST and CIFAR-10 show that the framework discards irrelevant information from image data. Furthermore, it appears to perform figure-ground separation.
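The abstract does not specify the training objective, so the following is only a rough sketch of how three such networks could be wired together. Everything in it is an assumption made for illustration: the single-layer networks, the helper names (`linear`, `softmax`, `init`, `losses`, `encoder_objective`), the layer sizes, and in particular the encoder objective, which here rewards classification accuracy while penalising easy reconstruction with a hypothetical weight `lam`.

```python
import math
import random

def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init(rows, cols, rng):
    """Small random weight matrix with `rows` outputs and `cols` inputs."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def losses(x, label, enc, dec, clf):
    """Return (reconstruction_loss, classification_loss) for one sample."""
    z = [math.tanh(v) for v in linear(enc, x)]   # encoder: input -> latent code
    x_hat = linear(dec, z)                       # decoder: latent code -> reconstruction
    probs = softmax(linear(clf, z))              # classifier: latent code -> class probabilities
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)  # mean squared error
    clf_loss = -math.log(probs[label])           # cross-entropy for the true class
    return recon, clf_loss

def encoder_objective(recon, clf_loss, lam=0.1):
    # ASSUMPTION: one possible adversarial-style objective, not the project's.
    # The encoder minimises this value, so it is pushed to keep what the
    # classifier needs while making the decoder's reconstruction harder.
    # The decoder would separately minimise `recon`, and the classifier `clf_loss`.
    return clf_loss - lam * recon

rng = random.Random(0)
n_in, n_latent, n_classes = 8, 3, 2              # hypothetical sizes
enc = init(n_latent, n_in, rng)
dec = init(n_in, n_latent, rng)
clf = init(n_classes, n_latent, rng)

x = [rng.random() for _ in range(n_in)]          # stand-in for one input sample
recon, clf_loss = losses(x, 1, enc, dec, clf)
enc_obj = encoder_objective(recon, clf_loss)
```

The sketch only evaluates the losses for a single sample; a real implementation would use an automatic differentiation library and alternate updates between the three networks.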

In this project, we aim to confirm these findings by training the framework on more complex datasets, to investigate whether it can reproduce other feature-selective mechanisms, and to extend its application to speech processing. For the latter, we will begin by training the network on a simple task, such as vowel classification. As our approach relies on representations obtained directly from input data, the framework may provide a more precise explanation of neural responses to stimuli. This may lead to new, testable hypotheses about the features underlying noise-robust speech processing, and may also contribute to improving speech recognition systems.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/W524323/1                                   30/09/2022  29/09/2028
2894189            Studentship   EP/W524323/1  29/09/2023  30/03/2027