Task-general reinforcement learning algorithms

Lead Research Organisation: University of Oxford

Department Name: Computer Science

Abstract

This project falls within the EPSRC Information and communication technologies (ICT) research area. The goal of the project is to develop algorithms capable of extracting task-general structure from data and using that structure to learn efficiently on novel tasks. These algorithms would enable deployment of reinforcement learning agents in the real world, where they could acquire new skills from little experience. This data-efficient skill acquisition could lead to more economical automation of industrial processes and household routines. While task-generalization in reinforcement learning is not a new research topic, tackling it effectively requires zooming out from the single-task perspective that has been prevalent in the field recently. The first step away from that perspective is to treat generalization performance on novel tasks as a key measure of success, and this is adopted as the central evaluation criterion for the algorithms developed in this project. The evaluation will consist of gathering both empirical and theoretical support for the algorithms. The outcomes of the development and evaluation will be published in the top conferences and academic journals of the artificial intelligence and machine learning communities.

One way to improve task-generalization in reinforcement learning is to consider a higher-level learning problem, called meta-learning, which aims to learn the learning algorithm itself with the explicit objective of fast learning on novel tasks. Meta-learning is a promising tool for task-generalization because it turns generalization itself into a learning problem, which makes it possible to exploit the strength of deep learning when abundant data is available. While meta-learning has seen a surge of interest and many exciting contributions in the past few years, the generalization performance of these approaches on genuinely novel tasks outside the training task distributions has received only limited attention. This gap guides the research project into the relatively unexplored territory of tackling the questions of task-generalization explicitly.

Concretely, this project develops new algorithms and training environments for task-general reinforcement learning. To develop new algorithms, novel meta-parameterizations of reinforcement learning agents and of the algorithms themselves will be considered. The generalization performance of reinforcement learning agents depends not only on the training algorithm but also on the training environments and datasets. Therefore, new generalization-focused training environments will also be developed.

Student:

Risto Vuorio

Period of Study:

Oct 2020 - Sep 2023

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2426703

Research Topic:

Unclassified

Organisations

University of Oxford (Lead Research Organisation)

People	ORCID iD
Shimon Whiteson (Primary Supervisor)
Risto Vuorio (Student)

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/R513295/1			01/10/2018	30/09/2023
2426703	Studentship	EP/R513295/1	01/10/2020	30/09/2023	Risto Vuorio
