Efficient deep learning in resource constrained environments

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

This project falls within the EPSRC Information and Communication Technologies (ICT) research area.

The project "Efficient deep learning in resource constrained environments" aims to make further advances in accelerating machine learning (ML) models, in particular deep neural networks, and in reducing their memory footprint. The slowness of neural networks, even when run on expensive hardware, is commonly raised as an issue in both academia and industry. ML practitioners would like to deploy powerful models to low-powered hardware, such as mobile devices, and ML researchers would like to iterate quickly during model development, which would make the process more interactive and productive.

An ML system typically has the following components:
(a) the hardware/software implementation which defines how the target device runs the model,
(b) the model, which defines how input data is processed (this often depends on the structure of the data), and
(c) the algorithm (loss function and optimizer), which dictates how the model is trained on the input data.
These components are largely seen as independent, enabling model developers to mix and match different options for their problem. As a result, existing research in model performance optimization tends to focus on each part in isolation.

The aim of this project is to develop methods that extract further speed and memory gains by crossing the boundaries between these components and applying optimisations throughout the entire ML system in a joint and mutually informed way. For example, current software toolkits and hardware accelerators for deep neural networks often make few or no assumptions about the overall architecture of the neural network being run or the structure of the data it is processing. Similarly, model architects tend to make few or no assumptions about the underlying hardware, missing out on potential speed or memory gains.

Concretely, the objective of the project is to develop neural network implementation, design and training methodologies that make the best use of the capabilities and constraints of the target device, as well as the structure of the data for the problem at hand. This would be of great use for industrial applications and fields of science that have to deal with vast quantities of data under time pressure or operate in environments with restricted hardware, such as genomics, high energy physics and financial data analysis.

One of the more ambitious research outcomes may also be a machine learning algorithm that discovers such optimisations automatically when provided with a sample of the data and the hardware constraints.

The project aligns with the Artificial Intelligence EPSRC ICT research direction; no industrial collaborations are planned.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509711/1                                   01/10/2016  30/09/2021
2053098            Studentship   EP/N509711/1  01/10/2018  30/09/2020  Edgaras Liberis