Multi-Agent Reinforcement Learning for Game AI and Robotic Control

Lead Research Organisation: University of York
Department Name: Computer Science


The Internet of Things and connected devices are creating opportunities for the increased automation of many complex tasks, from self-driving cars and unmanned aerial vehicles to high-performing AIs in increasingly complex multi-player game environments. In these scenarios, it is easy to specify a goal for the AIs to achieve, but much more difficult to define the strategy that an AI should follow in order to achieve it. In these multi-agent environments, training becomes difficult because it is necessary to discover many varied strategies for all agents present in the environment. Classical approaches to training an AI to perform well in an environment with multiple agents suffer from overfitting: learning agents become good at operating with or against themselves, but their performance drops considerably when they are matched against agents that act differently from those they have previously encountered. This issue is especially damaging when no good opponent strategies are known to test against. In the absence of a dataset of existing agent strategies, Multi-Agent Reinforcement Learning offers the possibility of safely learning good behaviour in these kinds of environments by trial and error through simulation, making it possible to learn how to solve these problems by training on the experiences encountered in the simulated environments.

The focus of the PhD research is to address an open question in Multi-Agent Reinforcement Learning whose answer would help to mitigate the overfitting problem: how can we qualitatively analyze the way that different agent strategies in an environment influence the eventual strategy of a learning agent? Using Reinforcement Learning techniques, it is possible to create a dataset of varied agent strategies starting without any knowledge of what a good strategy is in a given environment. Once we have a dataset of strategies, it is possible to create an algorithm that switches between the agent strategies presented to a learning agent in order to maximize its performance and robustness in the environment. This makes it possible to create an AI that not only performs well in a given environment, but is also robust against unseen agent strategies.
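As a rough illustration of the idea of choosing which agent strategy to present to a learning agent, the following minimal sketch uses a toy rock-paper-scissors setting. Everything in it is an assumption made for illustration and is not taken from the project: the game, the function names (`expected_payoff`, `pick_opponent`, `train`), the rule of mostly picking the opponent the learner currently does worst against, and the simple payoff-gradient update that stands in for a full Reinforcement Learning algorithm.

```python
import random

# Hypothetical toy game: rock-paper-scissors. A strategy is a list of three
# probabilities over (rock, paper, scissors). PAYOFF[i][j] is the reward for
# playing action i against action j.
PAYOFF = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def expected_payoff(learner, opponent):
    """Expected reward for the learner when both sides play mixed strategies."""
    return sum(
        learner[i] * opponent[j] * PAYOFF[i][j]
        for i in range(3)
        for j in range(3)
    )


def pick_opponent(learner, population, rng, explore=0.2):
    """Assumed selection rule: usually present the opponent the learner
    currently does worst against; sometimes present a random one."""
    if rng.random() < explore:
        return rng.choice(population)
    return min(population, key=lambda opp: expected_payoff(learner, opp))


def train(population, steps=200, lr=0.1, seed=0):
    """Train a learner against opponents drawn from a fixed population."""
    rng = random.Random(seed)
    learner = [1 / 3, 1 / 3, 1 / 3]
    for _ in range(steps):
        opp = pick_opponent(learner, population, rng)
        # Expected reward of each pure action against the chosen opponent.
        grad = [sum(opp[j] * PAYOFF[i][j] for j in range(3)) for i in range(3)]
        # Step towards better actions, clip at zero, then renormalise.
        learner = [max(p + lr * g, 0.0) for p, g in zip(learner, grad)]
        total = sum(learner)
        learner = [p / total for p in learner]
    return learner


if __name__ == "__main__":
    population = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # three pure strategies
    learner = train(population)
    worst = min(expected_payoff(learner, opp) for opp in population)
    print("learner strategy:", learner)
    print("worst-case expected payoff:", worst)
```

The worst-case payoff printed at the end is one simple way to measure robustness against every strategy in the population, rather than against a single opponent.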

The intended aim of Daniel's project is to develop Multi-Agent Reinforcement Learning algorithms for learning strategies to control swarms of agents, which are capable of outperforming multiple possible opponent strategies for a specific task. This will also enable the system to control vast numbers of connected AIs simultaneously. The project will first require a system to be built that allows for the creation and testing of Multi-Agent Reinforcement Learning algorithms. This system will be built on top of popular open source artificial intelligence frameworks, such as Keras and TensorFlow. Due to the computational complexity of modern Reinforcement Learning techniques, these tasks require proficiency in several programming languages, in addition to a practical understanding of how to test, train and deploy AIs on highly parallel computing clusters. The research outcomes of the project will be tested on the platform provided by Accelerated Dynamics, a London-based robotics start-up. Once the algorithms have been developed and thoroughly tested through simulation, they will have the opportunity to be tested in multiple real-world scenarios using swarms of aerial drones, thereby promoting and encouraging the sharing of efforts, insights, resources and research outcomes between industry and academia.



Studentship Projects

Project Reference   Relationship   Related To     Start        End          Student Name
EP/R512230/1                                      01/10/2017   30/09/2021
1946111             Studentship    EP/R512230/1   01/10/2017   30/09/2021   Daniel Beeston Hernandez