MOA: High Efficiency Deep Learning for Embedded and Mobile Platforms (Full EPSRC Fellowship Submission)

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

In just a few short years, breakthroughs from the field of deep learning have transformed how computers perform a wide variety of tasks such as recognizing a face, tracking emotions or monitoring physical activities. Unfortunately, the models and algorithms used by deep learning typically exert severe energy, memory and compute demands on local device resources, and this conventionally limits their adoption within mobile and embedded devices. Data perception and understanding tasks powered by deep learning are so fundamental to platforms like phones, wearables and home/industrial sensors that we must reach a point where current -- and future -- innovations in this area can be simply and efficiently integrated within even such resource-constrained systems. This research vector will lead directly to outcomes like brand new types of sensor-based products in the home/workplace, as well as increased intelligence within not only consumer devices, but also fields like medicine (smart stethoscopes) and autonomous systems (robotics/drones).

The MOA fellowship aims to fund basic research, development and eventual commercialization (through collaborations with a series of industry partners) of algorithms that enable general support for deep learning techniques on resource-constrained mobile and embedded devices. Primarily, this requires a radical reduction in the resources (viz. energy, memory and computation) consumed by these computational models -- especially at inference (i.e., execution) time. The proposal will have two main thrusts. First, build upon the existing work of the PI towards this goal, which includes: sparse intra-model layer representations (resulting in small models), dynamic forms of compression (models that can be squeezed smaller or bigger as needed), and scheduling of partitioned model architectures (splitting models and running each part on the processor inside a mobile/embedded device that suits it best). This thrust will re-examine these methods towards solving key remaining issues that would prevent such techniques from being used within products and as part of common practice. Second, investigate a new set of ambitious directions that seek to increase the utilization of emerging purpose-built small-form-factor hardware accelerators designed for deep learning algorithms (these accelerators are suitable for use within phones, wearables and drones). However, like any piece of hardware, such an accelerator is still limited by how it is programmed, and software toolchains that map deep learning models to the accelerator hardware remain in their infancy. Our preliminary results show that existing approaches to optimizing deep models, conceived first for conventional processors (e.g., DSPs, GPUs, CPUs), make poor use of the new capabilities of these hardware accelerators.
We will examine the development of important new approaches that modify the representation and inference algorithms used within deep learning so that they can fully utilize the new hardware capabilities. Directions include: mixed precision models and algorithms, low-data movement representations (that can trade memory operations for compute), and enhanced parallelization.

Planned Impact

Expanded discussion of how MOA relates to national importance is contained within the Pathways to Impact document. Here we summarize, in the following paragraphs, the core aspects of importance related to specific EPSRC priority areas, the broader economy, societal impact and contributions to knowledge.
MOA fits within the 'Robotics and artificial intelligence systems' priority area of the UKRI call. It aims to produce technology that enables machine learning models that otherwise need to run remotely in the cloud to instead run directly within mobile and embedded systems like robots, drones and small-form-factor devices. As a result, it is an enabling technology useful in the development of new applications or systems. There is direct application, for example, to healthy/independent living, as it would allow a device such as 'Alexa' to be a smarter and better companion to the elderly; or within safety, where it can allow a cheap battery-powered workplace or home camera to better understand the semantics of what it captures and react by calling the police if it observes dangerous activities. Significantly, the proposed research will likely amplify and extend existing or already on-going machine learning research. This is because it allows a machine learning model that perhaps today must reside in powerful cloud computers to run directly within drones, robots or devices; this in turn opens up an extended range of new possible application scenarios and use cases for such models.
At a societal level, MOA contributes to more ethical, safer and privacy-preserving forms of machine learning. Because it enables the use of machine learning directly on limited platforms like phones and devices, it reduces the need to transmit and process sensitive data on third-party cloud servers not controlled or owned by consumers. Through MOA outcomes, consumers will be able to demand devices (like next-generation medical instruments) that retain their data, yet still offer the benefits of the latest in machine learning.
From the perspective of knowledge, MOA seeks to develop truly innovative concepts in models that can best utilize the latest in commodity and accelerator processor architectures (see workpackage 2). MOA also aims to mature, and further invest in studying, the latest techniques in a brand-new academic area (efficient deep learning) to a commercial level of quality and rigour (see workpackage 1), so that they can transfer into commercial benefits for companies like Nokia and Samsung that have multiple UK-based teams.

Related Projects

Project Reference  Relationship  Related To    Start       End         Award Value
EP/S001530/1                                   28/06/2018  03/05/2020  £608,250
EP/S001530/2       Transfer      EP/S001530/1  04/05/2020  03/06/2022  £369,604
 
Description We discovered a method to compress the complex ML models used to recognize speech, so that the model can run directly on a phone. This allows speech to be recognized even when cellular coverage is poor, and recorded speech never needs to leave the phone to support the application. We have also extended this method to image-based ML.
Exploitation Route The methods are currently being used within industry for building new products.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description Our methods have been adopted in industry for use in products used by real consumers.
First Year Of Impact 2021
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Title CortexML Prototype Libraries and Framework 
Description An outcome of research funded by MOA is a software framework we call CortexML that facilitates the training and testing of image/vision-based deep neural networks designed for micro-controllers (such as ARM M-series and A-series processors). It is built primarily through the implementation of a variety of quantization routines, along with memory and flash management. We are using this tool to perform a large systematic study of deep learning on micro-controller technology. Tools of this precise type are currently not available to the research community.
Type Of Material Technology assay or reagent 
Year Produced 2018 
Provided To Others? No  
Impact No significant impact yet. However, the tool is very recent and currently used only within my group. In the near term we will open source this code.
 
Title TFLite Tools 
Description We released a tool that analyzes memory of machine learning models. This is useful for optimizing them for constrained devices. 
Type Of Material Technology assay or reagent 
Year Produced 2020 
Provided To Others? Yes  
Impact People have used this tool to lower the memory footprint of their models. It is used by some researchers in the community. 
URL https://github.com/eliberis/tflite-tools
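
To illustrate what analysing the memory of a model can mean in practice, the following is a minimal sketch and not the tool's actual implementation or API. It assumes a purely sequential chain of operators, int8 activations (one byte per element), and that an operator's input and output buffers must both be resident while it executes. The `Op` class, the layer names and the tensor shapes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """One operator in a sequential model (hypothetical, for illustration)."""
    name: str
    input_bytes: int
    output_bytes: int


def peak_activation_memory(ops):
    """Return (peak_bytes, op_name) for a sequential chain of operators.

    Assumption: while an operator runs, only its input and output
    buffers are live, so the working set is their combined size.
    """
    peak, peak_op = 0, None
    for op in ops:
        live = op.input_bytes + op.output_bytes
        if live > peak:
            peak, peak_op = live, op.name
    return peak, peak_op


# Hypothetical int8 model: sizes are height * width * channels, 1 byte each.
model = [
    Op("conv1", 96 * 96 * 3, 48 * 48 * 16),
    Op("conv2", 48 * 48 * 16, 24 * 24 * 32),
    Op("pool", 24 * 24 * 32, 32),
    Op("dense", 32, 10),
]

peak_bytes, peak_name = peak_activation_memory(model)
print(f"peak working set: {peak_bytes} bytes at operator {peak_name}")
```

Knowing which operator sets the peak indicates where to change the architecture (for example, fewer channels or a smaller input) so that the model fits a device's available RAM.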
 
Title torchquant 
Description Quantization is a popular technique for accelerating and compressing neural networks by utilizing low-bit arithmetic to represent weights and activations. It remains a hot area for research, with continued work on removing the gap in accuracy between full and low precision models. We observe that researchers in this area tend to rely on custom implementations, rather than approaches built into the popular machine learning libraries, as they are not sufficiently flexible to enable research. We are open sourcing TorchQuant, our MIT licensed library that builds upon PyTorch by providing researchers with modular components and implementations that will accelerate their research, and provide the community with consistent baselines. Using our library, we provide an example of how to quickly evaluate a research hypothesis: the "range-precision" trade-off for quantization-aware training. Our library can be found at this URL: https://github.com/camlsys/torchquant.
Type Of Material Improvements to research infrastructure 
Year Produced 2022 
Provided To Others? Yes  
Impact A variety of researchers have used this tool in their research. 
URL https://github.com/camlsys/torchquant
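
The "range-precision" trade-off mentioned in the description can be sketched in plain Python. This is an illustrative assumption about what the trade-off refers to, and it does not use or reproduce the TorchQuant API. It models a symmetric uniform quantizer: a wider clipping range avoids clipping large values but makes each quantization step coarser, while a narrower range does the opposite. The function names and sample values are hypothetical.

```python
def fake_quantize(x, clip, bits):
    """Clip x to [-clip, clip], then snap it to a symmetric uniform grid.

    With `bits` bits there are 2**(bits - 1) - 1 positive levels,
    so the step size grows in proportion to the clipping range.
    """
    levels = 2 ** (bits - 1) - 1
    step = clip / levels
    clipped = max(-clip, min(clip, x))
    return round(clipped / step) * step


def mean_squared_error(values, clip, bits):
    """Average squared error between values and their quantized versions."""
    return sum((v - fake_quantize(v, clip, bits)) ** 2 for v in values) / len(values)


# Hypothetical weights: mostly small, with one large outlier.
values = [-4.0, -0.9, -0.5, -0.2, -0.1, 0.0, 0.1, 0.3, 0.6, 0.8]

for clip in (0.5, 1.0, 4.0):
    err = mean_squared_error(values, clip, bits=4)
    print(f"clip={clip}: step={clip / 7:.4f}, mse={err:.5f}")
```

Sweeping `clip` in this way shows the two error sources pulling in opposite directions: clipping error dominates for small ranges and rounding error for large ones.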
 
Description 2nd International Workshop on Embedded and Mobile Deep Learning 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact In 2018 we ran an international workshop that discussed our research aims (and existing solutions) for the MOA project with a variety of international researchers and industry practitioners. At this workshop we also invited them to present their results, to share information between parties and build a critical mass of research activity within MOA-related topics. Finally, we also invited key international speakers to present a series of three keynotes. This event will be repeated in 2019. It has been instrumental in building activity in this exciting new area.
Year(s) Of Engagement Activity 2018,2019
URL https://www.sigmobile.org/mobisys/2018/workshops/deepmobile18/index.html
 
Description ACM HotMobile 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Schools
Results and Impact Presented on-going work related to the MOA project.
Year(s) Of Engagement Activity 2020
URL http://www.hotmobile.org/2020/
 
Description ACM HotMobile 2021 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact This is a workshop on mobile computing. I attended, and during informal discussions spoke about work conducted under the MOA fellowship.
Year(s) Of Engagement Activity 2021
URL http://www.hotmobile.org/2021/
 
Description ACM MobiCom 2020 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact I gave a keynote talk at MobiCom 2020. This is an international conference, and my talk covered aspects of MOA-funded research (along with other activities).
Year(s) Of Engagement Activity 2020
URL https://sigmobile.org/mobicom/2020/
 
Description Discussions with Google Brain at Google 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact My group talks to key people at Google Brain responsible for enabling TensorFlow for training deep models to function on embedded and mobile devices. We co-ordinate with them to understand upcoming methods being integrated into the software and, most importantly, open problems they are actively working on. We inform them of our research results, and this has altered their perception of which techniques to adopt and integrate into TensorFlow. The key contact is Pete Warden, and members of his team.
Year(s) Of Engagement Activity 2018,2019
 
Description Discussions with ML Research at ARM 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Discussion of the latest trends and results with the ML team at ARM (specifically Paul Whatmough, based in Boston, and his team). Primarily this is quarterly. It includes in-person visits and online conference calls.
Year(s) Of Engagement Activity 2018,2019
 
Description Huawei Collaboration Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Gave a talk about MOA-funded research at a workshop run by Huawei for company employees, with a wide variety of other academics also in attendance.
Year(s) Of Engagement Activity 2022
 
Description MLSys 2020 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Schools
Results and Impact We presented our latest results for MOA to the MLSys audience.
Year(s) Of Engagement Activity 2020
URL https://mlsys.org/
 
Description Ubicomp 2021 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Attended top-tier academic conference. Presented and discussed work performed under the MOA grant.
Year(s) Of Engagement Activity 2021