Abstraction and Generalisation in Reinforcement Learning

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

I am studying how tasks can be represented using abstract state spaces, and how these state spaces can be inferred from experience. If you were to describe to someone how to order a coffee at a cafe, you might say something like "walk through the door, go to the counter, look at the menu and decide what you want, tell the barista your order and then wait for your coffee to be served". In a reinforcement learning setting, that set of instructions defines a set of transitions between states: from the state of being outside the cafe, to the state of being inside the cafe, to the state of being at the counter, and so on. This state space, however, is not at the same granularity as the actions you take in the world. When you walk through the door, you are not considering every single action you take at the level of muscle twitches, or even at the level of swinging your legs. Reinforcement learning agents, by contrast, are constrained to the most granular action defined in their environment, which by analogy would be muscle twitches in this example. State abstraction is the process by which an agent would discover the high-level states described above, composing its granular actions hierarchically into higher-order skills.
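To make the distinction concrete, here is a minimal Python sketch of state abstraction as a many-to-one mapping from granular states to abstract ones. Everything in it is an illustrative assumption rather than the project's method: the cafe states, the partition, and the skill_to_counter helper are all hypothetical.

# Minimal sketch (all names and states are hypothetical, for illustration
# only): state abstraction as a many-to-one mapping phi from granular
# states to abstract states, plus a temporally extended "skill" that
# bundles primitive actions into one higher-order step.

granular_to_abstract = {
    "pavement": "outside_cafe",
    "doorway": "outside_cafe",
    "entrance": "inside_cafe",
    "queue": "inside_cafe",
    "till": "at_counter",
}

def phi(state: str) -> str:
    """Map a granular state to the abstract state that contains it."""
    return granular_to_abstract[state]

def skill_to_counter() -> list[str]:
    """A skill: a fixed sequence of primitive actions that carries the
    agent across abstract states (outside_cafe -> at_counter)."""
    return ["step", "step", "open_door", "step", "step"]

# Many granular states collapse onto one abstract state.
assert phi("pavement") == phi("doorway") == "outside_cafe"
print([phi(s) for s in ["pavement", "entrance", "till"]])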

I am currently using reinforcement learning methods centred around the successor representation: a way of representing value as the combination of expected state occupancies and reward. This allows us to probe the structure of the state space without the confound of reward that we would face by looking at the value structure directly. I am currently building abstract state spaces through symmetric compression and through temporal abstraction.
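For a fixed policy on a small tabular MDP, this factorisation can be written V = M r, where M = (I - gamma * P)^(-1) is the successor representation built from the policy's transition matrix P. The NumPy sketch below illustrates the idea on an assumed 4-state chain; the transition matrix, reward vector and discount factor are hypothetical choices, not results from the project.

import numpy as np

# Illustrative sketch of the successor representation (SR) for a fixed
# policy on a hypothetical 4-state chain MDP.

gamma = 0.95      # discount factor (assumed)
n_states = 4

# P[s, s'] = probability of moving s -> s' under the policy:
# a deterministic walk to the right with an absorbing final state.
P = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    P[s, s + 1] = 1.0
P[-1, -1] = 1.0

# SR: M[s, s'] = expected discounted future occupancy of s' from s,
# computed in closed form as M = (I - gamma * P)^{-1}.
M = np.linalg.inv(np.eye(n_states) - gamma * P)

# Value factorises into occupancy times reward, V = M @ r, so the
# structure of the state space (M) can be probed separately from reward.
r = np.array([0.0, 0.0, 0.0, 1.0])  # reward only in the last state
V = M @ r
print(V)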

Publications


Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/R513143/1                                      01/10/2018   30/09/2023
2281998             Studentship    EP/R513143/1   01/10/2019   10/07/2024   Matthew Sargent