REcoVER: Learning algorithms for REsilient and VErsatile Robots

Lead Research Organisation: Imperial College London

Department Name: Computing

Abstract

Robots have the potential to deliver tremendous benefits to our society by assisting us in all aspects of our everyday life. For example, they could increase the quality of life of elderly people by allowing them to stay longer at home on their own, through preparing meals, cleaning the house, and assisting them to get dressed. However, robots such as legged robots are also very complex machines, which are highly prone to damage when they are not operating in the well-controlled environments of factories. Moreover, because of this complexity and the large variety of environments they might encounter, it is impossible for engineers to anticipate all the damage situations that the robot may face and to program its reactions accordingly.

A promising approach to overcome this difficulty is to enable robots to learn on their own how to face and respond to the different situations they encounter. This approach shares similarities with the way humans and animals react in analogous circumstances. For instance, a child with a sprained ankle learns on their own how to walk on only one foot in order to minimise the pain. The objective of this research project is to develop the algorithmic foundations that allow robots to do the same. In previous work, we have developed creative learning algorithms that enable (physical) legged robots to overcome the loss of a leg by learning how to walk forward in less than two minutes. However, in that work, the algorithms were configured to solve a single task (i.e., walking forward), which does not leverage the versatility of legged robots and their capability, for instance, to walk in every direction, to jump, and to crawl.

The ambition of this project is to extend the adaptation capabilities of our algorithms to the entire range of the robots' abilities. This will be achieved by employing recent advances in hierarchical reinforcement learning to transfer knowledge across the different skills of the robots during the adaptation process. The combination of these hierarchical skill repertoires with our online-adaptation algorithms will enable robots to quickly transfer the result of their adaptation on one skill to the other skills. For instance, after finding a new way to walk forward, a robot might have discovered that it cannot rely on its front-left leg. With the proposed project, the robot will automatically use this information to speed up the adaptation process when it tries, for instance, to learn to turn, by avoiding the front-left leg there too. In addition to damage recovery, the same algorithm will enable robots to adapt to changes in their environment, for instance by changing their behaviours depending on whether they walk on a flat concrete floor or on sloping grassy ground.
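
The front-left-leg example above can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions, not the project's method: the abstract does not say how the knowledge is represented or transferred, so the sketch assumes each candidate behaviour is annotated with the set of legs it relies on, and the leg labels ("FL", "FR", "RL", "RR"), behaviour names and both helper functions are hypothetical.

```python
def infer_suspect_legs(failed, succeeded):
    """Return legs used by every failed behaviour and by no successful one.

    `failed` and `succeeded` are lists of leg sets observed while adapting
    one skill (an assumed representation, not taken from the project).
    """
    if not failed:
        return set()
    common_to_failures = set.intersection(*(set(legs) for legs in failed))
    used_successfully = set().union(*(set(legs) for legs in succeeded))
    return common_to_failures - used_successfully


def order_candidates(candidates, suspect_legs):
    """Order (name, legs) candidates for another skill so that behaviours
    avoiding the suspect legs are tried first. The sort is stable, so the
    original order is kept within each group."""
    return sorted(candidates, key=lambda item: bool(set(item[1]) & suspect_legs))


# Hypothetical trials made while re-learning to walk forward.
failed_forward = [{"FL", "FR", "RL", "RR"}, {"FL", "FR", "RL"}]
succeeded_forward = [{"FR", "RL", "RR"}]

# Hypothetical candidates for a different skill: turning on the spot.
turn_candidates = [
    ("pivot_front", {"FL", "FR"}),
    ("spin_all", {"FL", "FR", "RL", "RR"}),
    ("pivot_rear", {"RL", "RR"}),
]

suspects = infer_suspect_legs(failed_forward, succeeded_forward)
ordered = order_candidates(turn_candidates, suspects)
```

With these made-up trials, the only suspect leg is the front-left one, so the turning behaviour that avoids it is tried first. The first trial on the new skill then already benefits from what was learnt on the old one.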

In the long term, increasing the adaptation capabilities of versatile robots aims to enable robots to substitute for humans in the most dangerous tasks they have to perform. For instance, thanks to robots with improved adaptation abilities, it would be possible to send robots to search for survivors after an earthquake or to operate in a nuclear plant after a disaster. Improving the ability of robots to overcome unknown situations is one of the key requirements for them to become a significant part of our daily life.

This research will be undertaken at Imperial College London, in the Department of Computing. The project will benefit from state-of-the-art robotic facilities, including a quadruped robot, a hexapod robot and a motion capture system, to develop and experiment with a new generation of learning algorithms for resilient robots.

Funded Value:

£285,285

Funded Period:

Mar 21 - Feb 23

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/V006673/1

Principal Investigator:

Antoine Cully

Research Subject:

Info. & commun. Technol. (80%)

Mechanical engineering (20%)

Research Topic:

Artificial Intelligence (80%)

Robotics & Autonomy (20%)

Organisations

People	ORCID iD
Antoine Cully (Principal Investigator)

Publications

Allard M (2022) Hierarchical quality-diversity for online damage recovery

Allard M (2023) Online Damage Recovery for Physical Robots with Hierarchical Quality-Diversity in ACM Transactions on Evolutionary Learning and Optimization

Allard M (2022) Online Damage Recovery for Physical Robots with Hierarchical Quality-Diversity

Allard M (2022) Hierarchical Quality-Diversity for Online Damage Recovery

Cully A (2021) Quality-diversity optimisation

Cully A (2022) Quality-diversity optimisation

Cully A (2021) Multi-emitter MAP-elites

Flageat M (2023) Multiple Hands Make Light Work: Enhancing Quality and Diversity using MAP-Elites with Multiple Parallel Evolution Strategies

Flageat M (2023) Empirical analysis of PGA-MAP-Elites for Neuroevolution in Uncertain Domains in ACM Transactions on Evolutionary Learning and Optimization

Flageat M (2022) Benchmarking Quality-Diversity Algorithms on Neuroevolution for Reinforcement Learning

Key Findings
Impact Summary
Further Funding
Software and Technical Products


Description	During the RECOVER project, we invented several novel algorithms that enable robots to recover rapidly from unforeseen mechanical damage. For instance, a 6-legged robot can recover rapidly from losing one leg, without any diagnosis or dedicated sensors, while continuing its mission, such as maze navigation. The main insight is to let the robot learn a large collection of different ways to walk. For a 6-legged robot, this means using all its legs in different ways to move. When damage occurs, our algorithms rapidly search this collection of gaits to find one that still works well despite the damage. This is similar to building a large collection of backup plans. However, the collection does not assume any specific damage condition. Instead, it is built by searching for all the different ways the robot could walk while intact. This is analogous to children: a child will naturally learn many alternative ways to walk, like hopping on one leg or walking on all fours, simply because it is fun. This diversity of gaits becomes instrumental, however, if the child sprains an ankle: the child can instantly switch to one of these alternative gaits to minimise the pain. RECOVER is based on the same principle. There is, however, a limitation: if we need a large number (thousands) of alternatives for every single skill that the robot has to execute to achieve its mission, then the total number of alternatives to learn becomes intractable (i.e., millions of alternatives). To solve this challenge, we proposed in RECOVER to decompose every skill into a tree of sub-skills. For instance, walking forward can be decomposed into a succession of steps, each of which can be decomposed into a series of leg movements. Interestingly, these sub-skills can be shared across higher-level skills: walking backwards, for instance, also requires a series of leg movements. This hierarchical decomposition can then be used to enforce diversity of alternatives at the sub-skill level and thus keep the number of high-level skills tractable. In practice, we showed during the RECOVER project that this allows a 6-legged robot to recover autonomously from the unexpected loss of one of its legs while simultaneously performing complex maze navigation tasks, which require locomotion skills to go in every possible direction.
Exploitation Route	This new technology can be used in a wide variety of applications, such as transport vehicles and critical infrastructure, to enable rapid recovery after unexpected damage or perturbations in the environment. In follow-up projects, we are currently investigating how the findings of RECOVER can be used to make cars and ground vehicles safer to drive under a large range of perturbations (flat tyres, partial loss of traction, destabilising payloads).
Sectors	Aerospace, Defence and Marine; Transport
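
The "collection of backup plans" idea described in the findings above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm developed in the project: the gait names, leg indices, speed values and the `trial` damage model are all hypothetical, and a plain best-first trial-and-error loop stands in for the actual search procedure, which the description does not detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gait:
    name: str              # hypothetical label
    legs: frozenset        # indices (0-5) of the legs the gait relies on
    expected_speed: float  # speed on the intact robot (made-up units)


# Hypothetical repertoire, learnt while the robot is intact and without
# assuming any particular damage.
REPERTOIRE = [
    Gait("tripod", frozenset({0, 1, 2, 3, 4, 5}), 1.0),
    Gait("no_front_left", frozenset({1, 2, 3, 4, 5}), 0.8),
    Gait("rear_four", frozenset({2, 3, 4, 5}), 0.6),
    Gait("right_side_hop", frozenset({1, 3, 5}), 0.4),
]


def trial(gait, broken_legs):
    """Stand-in for a trial on the physical robot. Assumed toy damage model:
    a gait that relies on a broken leg reaches only 20% of its expected speed."""
    if gait.legs & broken_legs:
        return 0.2 * gait.expected_speed
    return gait.expected_speed


def recover(repertoire, broken_legs, good_enough=0.5):
    """Try gaits from most to least promising and stop at the first one whose
    measured speed is good enough. The search only uses trial outcomes;
    `broken_legs` is passed through to the simulated trial, not diagnosed."""
    tried = []
    best, best_speed = None, float("-inf")
    for gait in sorted(repertoire, key=lambda g: g.expected_speed, reverse=True):
        speed = trial(gait, broken_legs)
        tried.append(gait.name)
        if speed > best_speed:
            best, best_speed = gait, speed
        if speed >= good_enough:
            break
    return best, best_speed, tried


# Leg 0 is broken: the fastest gait fails, the second one still works.
best, speed, tried = recover(REPERTOIRE, frozenset({0}))
```

In this toy run, the robot tries the tripod gait, observes that it performs poorly, and settles on the next gait that does not rely on the broken leg after two trials. The same repertoire serves for any other damaged leg because it was built without reference to a specific damage condition.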


Description	The novel algorithms developed during the RECOVER project demonstrated that artificial intelligence and machine learning can be instrumental in enabling machines and infrastructure to adapt to unforeseen situations. After publishing several works in this direction, we were invited to participate in international studies to apply our algorithms and findings to new types of robots (similar to cars and boats) and showcase the potential of this technology to make transport and infrastructure safer and more resilient.
First Year Of Impact	2023
Sector	Aerospace, Defence and Marine; Transport
Impact Types	Societal


Description	Learning Introspective Control
Amount	$1,400,000 (USD)
Organisation	Defense Advanced Research Projects Agency (DARPA)
Sector	Public
Country	United States
Start	11/2022
End	11/2025


Title	QDax: Accelerated Quality-Diversity
Description	QDax is a tool to accelerate Quality-Diversity (QD) and neuro-evolution algorithms through hardware accelerators and massive parallelisation. QD algorithms usually take days or weeks to run on large CPU clusters. With QDax, QD algorithms can now be run in minutes. QDax has been developed as a research framework: it is flexible, easy to extend and build on, and can be used for any problem setting. My entire research group now uses this tool every day, as do many other groups around the world.
Type Of Technology	Software
Year Produced	2022
Open Source License?	Yes
Impact	This software leverages hardware acceleration (such as GPUs) to speed up a new family of algorithms, called Quality-Diversity algorithms, by a factor of 100. This means that instead of waiting days to get our results, we can now achieve the same outcomes in a few minutes. In less than one year, the GitHub repository already collected 200 stars and is
URL	https://github.com/adaptive-intelligent-robotics/QDax/
