Artificial intelligence technologies, Image and vision computing, Digital signal processing

Lead Research Organisation: Imperial College London

Department Name: Electrical and Electronic Engineering

Abstract

Autoencoders are neural networks used in many areas of machine learning. They consist of two main parts, an encoder and a decoder. The encoder transforms input data into a vector representation and the decoder recovers the data from it. The main training objective of an autoencoder is to minimize the difference between the input and the recovered data. Equally important is the latent representation, the vector output by the encoder. This vector contains the essential, low-dimensional information needed to reconstruct the data, which is why autoencoders are often used for data compression. However, in many applications recovering the original data is not necessary. Examples include certain computer vision tasks, such as image classification or retrieval, where only the image representation is needed to make an accurate prediction.

The focus of the proposed research is on image retrieval under extreme constraints imposed by the application scenario, e.g. a noisy communication channel over which only a small amount of data can be transmitted in a limited time. Deep learning methods (based on deep neural networks) will be investigated to perform these tasks, and the constraints as well as a noise model will be incorporated into the system. Two specific application scenarios will be considered: person re-identification and face recognition in CCTV surveillance tasks, where large volumes of data need to be accessed efficiently, hence the memory and noise constraints. Face recognition is a similar task to person re-identification, but the type of detail that must be encoded for successful identification differs.

Various approaches will be tested to perform these tasks. As a baseline, conventional image compression methods will be used to meet channel bandwidth limitations, and retrieval tasks will be performed on the recovered (decompressed) images. Subsequently, the compression algorithms will be replaced by an autoencoder: transmission of its latent representation will be simulated, and the images reconstructed by the decoder will be fed into the retrieval network. An alternative approach will consist of applying representation encoders to the original images and simulating transmission of the resulting feature vector. The effectiveness of each encoding approach will be assessed by the performance of the identification methods and the compactness of the encoded representation when subjected to various amounts of noise.
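
The feature-vector approach can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the project's method: the channel is modelled as additive white Gaussian noise (AWGN) at a chosen signal-to-noise ratio, feature vectors are random stand-ins for the output of a representation encoder, retrieval is a nearest-neighbour search by cosine similarity, and the function names `awgn_channel`, `cosine_similarity` and `retrieve` are hypothetical.

```python
import math
import random


def awgn_channel(vector, snr_db, rng):
    """Simulate transmission by adding white Gaussian noise at the given SNR (in dB)."""
    signal_power = sum(x * x for x in vector) / len(vector)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in vector]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, gallery):
    """Return the index of the gallery vector most similar to the query."""
    return max(range(len(gallery)), key=lambda i: cosine_similarity(query, gallery[i]))


# Random 64-dimensional vectors stand in for encoder outputs (assumption).
rng = random.Random(0)
gallery = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(10)]

# Send the feature vector of identity 3 through the noisy channel, then retrieve.
received = awgn_channel(gallery[3], snr_db=10.0, rng=rng)
match = retrieve(received, gallery)
```

Repeating the last two lines over a range of `snr_db` values and vector lengths gives a simple way to relate noise level, representation size and retrieval accuracy, which is the kind of trade-off the study sets out to measure.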

The main objective of this research will be to study the relation between noise, retrieval performance, and representation compression for deep learning autoencoder models, and to compare it with the alternative approach based on sending feature vectors only. The problem of encoding specific, task-oriented image representations for transmission over a noisy communication channel has not been investigated yet. Autoencoders that incorporate noisy communication channel models into their designs are relatively new and are typically evaluated with standard metrics such as MSE (mean squared error) and PSNR (peak signal-to-noise ratio), whereas this research will consider task-oriented and thus more reliable metrics, namely retrieval performance.
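
For reference, the two standard reconstruction metrics can be computed as in the sketch below. It assumes images flattened to lists of pixel values in the range 0 to 255 and uses the usual definition PSNR = 10 log10(MAX^2 / MSE); the function names `mse` and `psnr` are illustrative.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized sequences of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    squared = ((a - b) ** 2 for a, b in zip(original, reconstructed))
    return sum(squared) / len(original)


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)
```

Both metrics measure pixel-level fidelity only, so two reconstructions with the same PSNR can differ in how well they preserve the details a retrieval network relies on.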

This research is relevant to numerous surveillance scenarios in which large amounts of data must be transmitted over a wireless communication channel within a limited time. An example of such an application is a drone equipped with a video camera searching for a suspect in a large crowd. The constraints introduced by this application, such as the use of low-power transmitters, which makes the transmitted signal vulnerable to noise, and the high compression level of the transmitted information, pose new challenges for the methods typically proposed for human identification tasks. They require new methods of encoding image representations that are robust to noise and task-oriented, retaining only the task-relevant information.

Student:

Mikolaj Jankowski

Period of Study:

Sep 18 - Sep 22

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2124799

Research Topic:

Unclassified

Organisations

Imperial College London (Lead Research Organisation)

People	ORCID iD
Krystian Mikolajczyk (Primary Supervisor)
Mikolaj Jankowski (Student)

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/R513052/1			30/09/2018	29/09/2023
2124799	Studentship	EP/R513052/1	30/09/2018	29/09/2022	Mikolaj Jankowski
