Task-general reinforcement learning algorithms

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

This project falls within the EPSRC Information and communication technologies (ICT) research area. The goal of the project is to develop algorithms capable of extracting task-general structure from data and using that structure to learn efficiently on novel tasks. These algorithms would enable deployment of reinforcement learning agents in the real world, where they could acquire new skills from little experience. This data-efficient skill acquisition could lead to more economical automation of industrial processes and household routines. While task-generalization in reinforcement learning is not a new research topic, tackling it effectively requires zooming out from the single-task perspective that has recently been prevalent in the field. The first step away from that perspective is to treat generalization performance on novel tasks as a key measure of success. This measure is adopted as the central evaluation criterion for the algorithms developed in this project. The evaluation will consist of gathering both empirical and theoretical support for the algorithms. The outcomes of the development and evaluation will be published in the top conferences and academic journals of the artificial intelligence and machine learning communities.

One way to improve task-generalization in reinforcement learning is to consider a higher-level learning problem, called meta-learning, which aims to learn the learning algorithm itself with the explicit objective of fast learning on novel tasks. Meta-learning is a promising tool for task-generalization because it turns generalization itself into a learning problem, and can therefore leverage the strength of deep learning when abundant data is available. While meta-learning has seen a surge of interest and many exciting contributions in the past few years, the generalization performance of these approaches on genuinely novel tasks outside the training task distribution has received only limited attention. This gap guides the research project into the relatively unexplored territory of tackling the questions of task-generalization explicitly.
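
As a purely illustrative sketch of the meta-learning idea described above, and not the method developed in this project, the following Python code meta-learns an initial weight for a toy family of one-dimensional regression tasks. It uses a first-order update in the style of Reptile, chosen here only for its simplicity. The task family (fit y = slope * x), the function names, and all hyperparameters are assumptions made for the example. The final comparison adapts the learned initialization to a task inside the training range and to one outside it.

```python
import random


def inner_adapt(w, slope, rng, steps=3, lr=0.1):
    """Inner loop: a few SGD steps on one task (fit y = slope * x with scalar w)."""
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        grad = 2.0 * (w * x - slope * x) * x  # d/dw of the squared error
        w -= lr * grad
    return w


def task_loss(w, slope):
    """Expected squared error for x ~ U(-1, 1), using E[x^2] = 1/3."""
    return (w - slope) ** 2 / 3.0


def meta_train(meta_steps=2000, meta_lr=0.05, seed=0):
    """Outer loop: learn an initial weight w0 that adapts quickly.

    Training slopes are drawn from U(2, 4) (an assumed toy task distribution).
    The Reptile-style update moves w0 toward the task-adapted weight.
    """
    rng = random.Random(seed)
    w0 = 0.0
    for _ in range(meta_steps):
        slope = rng.uniform(2.0, 4.0)
        w_adapted = inner_adapt(w0, slope, rng)
        w0 += meta_lr * (w_adapted - w0)
    return w0


def adapted_loss(w0, slope, seed=1):
    """Loss on a task after running the inner loop from initialization w0."""
    rng = random.Random(seed)
    return task_loss(inner_adapt(w0, slope, rng), slope)


if __name__ == "__main__":
    w0 = meta_train()
    print("meta-learned initialization:", w0)
    print("loss after adapting, slope 3 (inside training range):", adapted_loss(w0, 3.0))
    print("loss after adapting, slope 6 (outside training range):", adapted_loss(w0, 6.0))
    print("loss after adapting from w = 0, slope 3:", adapted_loss(0.0, 3.0))
```

In this toy setting the learned initialization settles near the centre of the training slopes, so a few inner steps suffice on tasks from that range, while a task far outside it remains poorly fit after the same number of steps. This is a small-scale picture of the out-of-distribution generalization gap the abstract refers to.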

Concretely, this project will develop new algorithms and training environments for task-general reinforcement learning. To develop new algorithms, novel meta-parameterizations of reinforcement learning agents and of the algorithms themselves will be considered. The generalization performance of reinforcement learning agents depends not only on the training algorithm but also on the training environments and datasets. Therefore, new generalization-focused training environments will have to be developed.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/R513295/1      |              |              | 01/10/2018 | 30/09/2023 |
2426703           | Studentship  | EP/R513295/1 | 01/10/2020 | 30/09/2023 | Risto Vuorio