Near-Optimal Scalable Algorithms for Multi-Agent Reinforcement Learning
Lead Research Organisation:
University of Oxford
Department Name: Statistics
Abstract
Sequential decision making is an important setting of modern statistical theories and applications, where an agent sequentially interacts with an environment - observing its state, taking actions and receiving rewards - with the objective of maximizing the cumulative reward.
This class of methods has been successfully applied to games, finance, robotics, autonomous driving and computer vision. Many of these applications involve the participation of multiple agents, which brings a scalability issue with respect to the state and action spaces. Indeed, while many of the already established algorithms achieve good scaling guarantees for the single agent setting, they do not perform well if they are immediately applied to the case where multiple agents interact in the same environment. This is because the simplest approach would be to consider a state or an action as the joint states or actions of all the agents, causing the spaces to grow exponentially with respect to the number of agents. Our approach consists in dividing the set of agents into neighbourhoods in order to reduce the exponential scaling of the setting.
The aim of the project is to obtain a near-optimal algorithm for the multi-agent setting with a computational complexity that depends exponentially only on the cardinality of a neighbourhood.
This project falls within the EPSRC "Statistics an applied probability" and "Artificial intelligence technologies" research areas.
This class of methods has been successfully applied to games, finance, robotics, autonomous driving and computer vision. Many of these applications involve the participation of multiple agents, which brings a scalability issue with respect to the state and action spaces. Indeed, while many of the already established algorithms achieve good scaling guarantees for the single agent setting, they do not perform well if they are immediately applied to the case where multiple agents interact in the same environment. This is because the simplest approach would be to consider a state or an action as the joint states or actions of all the agents, causing the spaces to grow exponentially with respect to the number of agents. Our approach consists in dividing the set of agents into neighbourhoods in order to reduce the exponential scaling of the setting.
The aim of the project is to obtain a near-optimal algorithm for the multi-agent setting with a computational complexity that depends exponentially only on the cardinality of a neighbourhood.
This project falls within the EPSRC "Statistics an applied probability" and "Artificial intelligence technologies" research areas.
Organisations
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/V520202/1 | 30/09/2020 | 31/10/2025 | |||
2444539 | Studentship | EP/V520202/1 | 30/09/2020 | 29/09/2024 | Carlo Alfano |