Reinforcement Learning on the Edge: Specialised Control Policies with Limited Resources

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

In recent years, there has been increasing interest in real-world applications of Reinforcement Learning (RL). One prominent domain is mobile robotics. Robots carry sensors to observe their environment and actuators to act in it. Even when such devices are controlled by a policy learned through RL, they typically rely on a centralised server to determine their next actions. This adds significant latency and can make RL infeasible for real-world deployment. Furthermore, robots may encounter situations that differ from those on which they were trained. This project aims to solve both problems. First, it seeks to implement efficient on-device inference for RL models. Second, it will investigate new methods of lifelong RL that can learn dynamically from new experiences. Through these lines of investigation, the project seeks to introduce a new paradigm of RL that leverages the experience and processing power of many simultaneous resource-constrained learners. These learners will be tasked both with evaluating RL policies and with integrating new observations into their models in a resource-efficient way, periodically sharing this information with other devices, as in existing distributed RL approaches. Realising this goal will require new learning approaches and RL algorithms, combining work from systems and machine learning in a novel context.
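The split the abstract envisages, cheap on-device policy evaluation combined with periodic parameter sharing between peers, can be sketched as follows. This is a minimal illustration only: the `EdgeLearner` class, the linear policy, and the `share_weights` averaging step are assumptions for the sketch, not the project's actual design.

```python
import numpy as np


class EdgeLearner:
    """Illustrative resource-constrained learner: evaluates a tiny
    linear policy on-device and can fold new experience into it."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.1, size=(obs_dim, n_actions))

    def act(self, obs):
        # On-device inference: one matrix product, no server round-trip.
        return int(np.argmax(obs @ self.w))

    def local_update(self, obs, action, advantage, lr=0.01):
        # Simplified policy-gradient-style update from a new experience,
        # standing in for the project's resource-efficient learning step.
        grad = np.zeros_like(self.w)
        grad[:, action] = obs
        self.w += lr * advantage * grad


def share_weights(learners):
    # Periodic peer exchange: average parameters across devices,
    # in the spirit of federated / distributed RL schemes.
    mean_w = np.mean([l.w for l in learners], axis=0)
    for l in learners:
        l.w = mean_w.copy()
```

In this toy form each device acts and learns entirely locally, and only the occasional `share_weights` round involves communication, which is the latency-avoiding property the project targets.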


Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/T517847/1                                   01/10/2020  30/09/2025
2646067            Studentship   EP/T517847/1  04/01/2022  30/06/2025  Carlos Purves