Efficient deep learning in constrained computing platforms

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

This project falls within the EPSRC Artificial Intelligence Technologies research area and is titled "Efficient deep learning in constrained computing platforms".

Since the success of deep neural networks in competitions such as ImageNet, a large part of machine learning research has been aimed at improving the accuracy of deep neural nets. This has resulted in sophisticated models that perform very well in tasks such as computer vision, speech recognition and natural language processing.
Simultaneously, advances in the computing power of hardware platforms are allowing more complex algorithms to run on constrained platforms such as mobile and wearable devices. While deep learning computations are better suited to GPUs due to their parallel nature, many leading hardware platforms have already released, or will soon release, hardware accelerators specifically designed for machine learning tasks.

These two powerful trends provide the potential for broad adoption of on-device deep learning. However, little attention has been paid to the practical requirements of deploying these accurate models to constrained real-time hardware platforms. In addition to model requirements in terms of inference time, these platforms have stringent requirements in terms of power consumption, memory footprint, latency, parallelisation, and cycle budget.

This project will seek to address some of these fundamental challenges that prevent deep neural nets from being widely adopted on mobile computing platforms. Within this context, our aim and objectives include, but are not limited to:
-Finding novel solutions to decompose existing deep learning models at runtime in order to make them execute more efficiently on hardware targets. Ideally these solutions should work automatically with any underlying model and should not require users to have knowledge of the model for manual tuning.
-Developing new deep learning algorithms designed from the ground up to run efficiently on constrained classes of computing platforms.
-Studying the effects of compression and quantisation on deep learning and ways to exploit the sparsity in neural nets.
-Finding new models and optimisations that can utilise machine learning hardware acceleration blocks.
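
As a rough illustration of the compression, quantisation and sparsity ideas named in the objectives above, the following is a minimal sketch in plain Python. It is not the project's method: the function names (`quantise_int8`, `dequantise`, `prune_by_magnitude`) are hypothetical, and symmetric 8-bit quantisation with magnitude-based pruning are assumed here only as common textbook examples of these techniques.

```python
def quantise_int8(weights):
    """Symmetric uniform quantisation of float weights to signed 8-bit integers.

    Returns the integer codes and the scale needed to map them back to floats.
    (Hypothetical helper for illustration only.)
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantise(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by ascending magnitude; the first k are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


weights = [0.5, -1.0, 0.02, 0.25]
codes, scale = quantise_int8(weights)
approx = dequantise(codes, scale)
sparse = prune_by_magnitude(weights, 0.5)
```

In this sketch each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per weight, and pruning leaves zeros that sparse storage or compute kernels could skip.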

This project relates to EPSRC's strategic focus on making machine learning more robust, resilient and transferable. It can potentially contribute to other domains such as human-computer interaction and the social sciences by facilitating the use of deep learning principles and algorithms towards a richer and more reliable understanding of user behaviour and context, using mobile systems with sensory perception, reasoning and interaction capabilities.

Publications


Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
NE/W502728/1      |              |              | 01/04/2021 | 31/03/2022 |
1894770           | Studentship  | NE/W502728/1 | 01/10/2017 | 31/03/2022 | Milad Alizadeh