Reinforcement Learning on the Edge: Specialised Control Policies with Limited Resources

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

In recent years, there has been increasing interest in real-world applications of Reinforcement Learning (RL). One prominent domain is mobile robotics. Robots carry sensors to observe their environment and actuators to act in it. Even when such devices are controlled by a policy learned through RL, they typically rely on a centralised server to determine their next actions. This adds significant latency and can make RL infeasible for real-world deployment. Furthermore, robots may encounter situations that differ from those on which they were trained. This project aims to solve both problems. First, it seeks to implement efficient on-device inference for RL models. Second, it will investigate new methods of lifelong RL that can learn dynamically from new experiences. Through these lines of investigation, the project seeks to introduce a new paradigm of RL that leverages the experience and processing power of many simultaneous resource-constrained learners. These learners will be tasked both with evaluating RL policies and with integrating new observations into their models in a resource-efficient way, periodically sharing this information with other devices, as in existing distributed RL approaches. Realising this goal will require new learning approaches and RL algorithms, combining work from systems and machine learning in a novel context.
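The split the abstract envisages, cheap on-device policy evaluation combined with periodic parameter sharing between peers, can be sketched as follows. This is a minimal illustration only: the `EdgeLearner` class, the linear policy, and the `share_weights` averaging step are assumptions for the sketch, not the project's actual design.

```python
import numpy as np


class EdgeLearner:
    """Illustrative resource-constrained learner: evaluates a tiny
    linear policy on-device and can fold new experience into it."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.1, size=(obs_dim, n_actions))

    def act(self, obs):
        # On-device inference: one matrix product, no server round-trip.
        return int(np.argmax(obs @ self.w))

    def local_update(self, obs, action, advantage, lr=0.01):
        # Simplified policy-gradient-style update from a new experience,
        # standing in for the project's resource-efficient learning step.
        grad = np.zeros_like(self.w)
        grad[:, action] = obs
        self.w += lr * advantage * grad


def share_weights(learners):
    # Periodic peer exchange: average parameters across devices,
    # in the spirit of federated / distributed RL schemes.
    mean_w = np.mean([l.w for l in learners], axis=0)
    for l in learners:
        l.w = mean_w.copy()
```

In this toy form each device acts and learns entirely locally, and only the occasional `share_weights` round involves communication, which is the latency-avoiding property the project targets.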


Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/T517847/1                                   01/10/2020  30/09/2025
2646067            Studentship   EP/T517847/1  04/01/2022  30/06/2025  Carlos Purves