REcoVER: Learning algorithms for REsilient and VErsatile Robots

Lead Research Organisation: Imperial College London

Department Name: Computing

Abstract

Robots have the potential to deliver tremendous benefits to our society by assisting us in all aspects of our everyday life. For example, they could increase the quality of life of elderly people by allowing them to stay longer at home on their own, by preparing meals, cleaning the house, and helping them to get dressed. However, robots such as legged robots are also very complex machines, which are highly prone to damage when they are not operating in the well-controlled environments of factories. Moreover, because of this complexity and the large variety of environments they might encounter, it is impossible for engineers to anticipate all the damage situations that the robot may encounter and to program its reactions accordingly.

A promising approach to overcome this difficulty is to enable robots to learn on their own how to face and respond to the different situations they encounter. This approach shares similarities with the way humans and animals react in analogous circumstances. For instance, a child with a sprained ankle learns on their own how to walk on only one foot in order to minimise the pain. The objective of this research project is to develop the algorithmic foundations that allow robots to do the same. In previous work, we developed creative learning algorithms that enable (physical) legged robots to overcome the loss of a leg by learning how to walk forward in less than two minutes. However, in that work, the algorithms were configured to solve a single task (i.e., walking forward), which does not leverage the versatility of legged robots and their capability, for instance, to walk in every direction, to jump, and to crawl.

The ambition of this project is to extend the adaptation capabilities of our algorithms to the entire range of the robots' abilities. This will be achieved by employing recent advances in hierarchical reinforcement learning to transfer knowledge during the adaptation process across the different skills of the robots. The combination of these hierarchical skill repertoires with our online-adaptation algorithms will enable robots to quickly transfer the result of their adaptation on one skill to the other skills. For instance, after finding a new way to walk forward, a robot might have discovered that it cannot rely on its front-left leg. With the proposed project, this information will be automatically used by the robot to speed up the adaptation process when it tries, for instance, to learn to turn, by avoiding the front-left leg there too. In addition to damage recovery, the same algorithm will enable robots to adapt to changes in their environment, for instance by changing their behaviours depending on whether they walk on a flat concrete floor or on sloping grassy ground.
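The front-left-leg example above can be illustrated with a toy sketch. It is not the project's hierarchical reinforcement learning method: it assumes a single faulty leg, a pass/fail trial outcome, and hand-written repertoires, and every name in it (`Behaviour`, `adapt`, `WALK_FORWARD`, `TURN`, the gait names) is hypothetical. It only shows how a belief about damage learned on one skill can shorten trial and error on another.

```python
from typing import Callable, NamedTuple

class Behaviour(NamedTuple):
    name: str
    legs: frozenset   # legs this behaviour relies on (assumed known)
    expected: float   # expected performance on an intact robot

ALL_LEGS = frozenset({"front_left", "front_right", "rear_left", "rear_right"})

def b(name, legs, expected):
    return Behaviour(name, frozenset(legs), expected)

# Hypothetical, hand-written skill repertoires (one list per skill).
WALK_FORWARD = [
    b("trot", ALL_LEGS, 1.0),
    b("left_biased_gait", {"front_left", "rear_left", "rear_right"}, 0.8),
    b("three_leg_hop", {"front_right", "rear_left", "rear_right"}, 0.6),
]
TURN = [
    b("pivot_all", ALL_LEGS, 1.0),
    b("pivot_front", {"front_left", "front_right"}, 0.9),
    b("pivot_left", {"front_left", "rear_left"}, 0.8),
    b("pivot_rear", {"rear_left", "rear_right"}, 0.5),
]

def adapt(repertoire, suspects, works: Callable[[Behaviour], bool]):
    """Trial-and-error over one skill, guided by a set of suspect legs.

    Returns (working behaviour or None, number of trials, updated suspects).
    """
    suspects = set(suspects)
    untried = list(repertoire)
    trials = 0
    while untried:
        # Prefer behaviours that rely on few suspect legs, then high expected performance.
        untried.sort(key=lambda bh: (len(bh.legs & suspects), -bh.expected))
        candidate = untried.pop(0)
        trials += 1
        if works(candidate):
            suspects -= candidate.legs   # legs used by a working behaviour are cleared
            return candidate, trials, suspects
        suspects &= candidate.legs       # single-fault assumption: the fault is among these legs
    return None, trials, suspects

# Simulated robot: a behaviour works only if it avoids the (hidden) broken leg.
BROKEN = "front_left"
works = lambda bh: BROKEN not in bh.legs

walk, walk_trials, belief = adapt(WALK_FORWARD, ALL_LEGS, works)
turn_transfer, turn_trials_transfer, _ = adapt(TURN, belief, works)
turn_scratch, turn_trials_scratch, _ = adapt(TURN, ALL_LEGS, works)

print(walk.name, walk_trials, sorted(belief))
print(turn_transfer.name, turn_trials_transfer, "vs", turn_trials_scratch)
```

In this toy setting, adapting the walking skill narrows the suspects down to the front-left leg, and reusing that belief finds a working turn in one trial instead of two. A real system would have to deal with noisy performance measurements and multiple simultaneous faults, which this sketch ignores.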

In the long term, increasing the adaptation capabilities of versatile robots aims to enable robots to substitute for humans in the most dangerous tasks they have to perform. For instance, thanks to robots with improved adaptation abilities, it would be possible to send robots to search for survivors after an earthquake or to operate in a nuclear plant after a disaster. Improving the ability of robots to overcome unknown situations is one of the key requirements to enable them to be a significant part of our daily life.

This research will be undertaken at Imperial College London, in the Department of Computing. The project will benefit from state-of-the-art robotic facilities, including a quadruped robot, a hexapod robot and a motion capture system, to develop and experiment with a new generation of learning algorithms for resilient robots.

Funded Value:

£285,285

Funded Period:

Mar 21 - Feb 23

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/V006673/1

Principal Investigator:

Antoine Cully

Research Subject:

Info. & commun. Technol. (80%)

Mechanical engineering (20%)

Research Topic:

Artificial Intelligence (80%)

Robotics & Autonomy (20%)

People	ORCID iD
Antoine Cully (Principal Investigator)

Publications


Allard M (2022) Hierarchical quality-diversity for online damage recovery

Allard M (2023) Online Damage Recovery for Physical Robots with Hierarchical Quality-Diversity in ACM Transactions on Evolutionary Learning and Optimization

Cully A (2021) Multi-emitter MAP-elites

Cully A (2021) Quality-diversity optimisation

Cully A (2022) Quality-diversity optimisation

Flageat M (2023) Empirical analysis of PGA-MAP-Elites for Neuroevolution in Uncertain Domains in ACM Transactions on Evolutionary Learning and Optimization

Flageat M (2023) Uncertain Quality-Diversity: Evaluation methodology and new methods for Quality-Diversity in Uncertain Domains in IEEE Transactions on Evolutionary Computation

Grillotti L (2023) Kheperax: a Lightweight JAX-based Robot Control Environment for Benchmarking Quality-Diversity Algorithms

Grillotti L (2023) Don't Bet on Luck Alone: Enhancing Behavioral Reproducibility of Quality-Diversity Solutions in Uncertain Domains

Grillotti L (2022) Relevance-guided unsupervised discovery of abilities with quality-diversity algorithms

Further Funding

Description	Learning Introspective Control
Amount	$1,600,000 (USD)
Organisation	Defense Advanced Research Projects Agency (DARPA)
Sector	Public
Country	United States
Start	11/2022
End	11/2025

Software and Technical Products

Title	QDax: Accelerated Quality-Diversity
Description	QDax is a tool to accelerate Quality-Diversity (QD) and neuro-evolution algorithms through hardware accelerators and massive parallelization. QD algorithms usually take days or weeks to run on large CPU clusters. With QDax, QD algorithms can now be run in minutes. QDax has been developed as a research framework: it is flexible, easy to extend and build on, and can be used for any problem setting. My entire research group now uses this tool every day, as do many other groups around the world.
Type Of Technology	Software
Year Produced	2022
Open Source License?	Yes
Impact	This software leverages hardware acceleration (such as GPUs) to speed up a new family of algorithms, called Quality-Diversity algorithms, by a factor of 100. This means that instead of waiting days to get our results, we can now achieve the same outcomes in a few minutes. In less than 1 year, the GitHub repository already collected 200 stars and is
URL	https://github.com/adaptive-intelligent-robotics/QDax/
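
For readers unfamiliar with Quality-Diversity algorithms, the following is a minimal sketch of MAP-Elites, one well-known QD algorithm, in plain Python. It does not use the QDax API and shows none of the hardware acceleration described above. The toy fitness function, the behaviour descriptor, the grid size, and all function names are assumptions chosen for illustration.

```python
import random

GRID = 10  # cells per descriptor dimension (assumed value)

def fitness(genome):
    # Toy objective: higher is better, maximum at the origin.
    return -sum(v * v for v in genome)

def descriptor(genome):
    # Toy behaviour descriptor: the first two genes, each in [0, 1].
    return genome[0], genome[1]

def cell(desc):
    # Map a descriptor to a discrete grid cell.
    return tuple(min(GRID - 1, int(d * GRID)) for d in desc)

def mutate(genome, rng, sigma=0.1):
    # Gaussian mutation, clipped to [0, 1].
    return [min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in genome]

def map_elites(iterations=2000, dim=4, n_init=100, seed=0):
    rng = random.Random(seed)
    archive = {}  # cell -> (fitness, genome): the best solution found per cell
    for i in range(iterations):
        if i < n_init or not archive:
            child = [rng.random() for _ in range(dim)]   # random initialisation
        else:
            parent = rng.choice(list(archive.values()))[1]
            child = mutate(parent, rng)                  # vary an existing elite
        f = fitness(child)
        c = cell(descriptor(child))
        # Keep the child only if its cell is empty or it beats the current elite.
        if c not in archive or f > archive[c][0]:
            archive[c] = (f, child)
    return archive

archive = map_elites()
print(f"{len(archive)} of {GRID * GRID} cells filled")
```

The result is a collection of diverse, locally best solutions rather than a single optimum. Each evaluation in the loop is independent of the others within a batch, which is what makes this family of algorithms amenable to the massive parallelization that QDax targets.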
