REcoVER: Learning algorithms for REsilient and VErsatile Robots

Lead Research Organisation: Imperial College London
Department Name: Computing

Abstract

Robots have the potential to deliver tremendous benefits to our society by assisting us in all aspects of our everyday life. For example, they could increase the quality of life of elderly people by allowing them to stay longer at home on their own, through preparing meals, cleaning the house, and assisting them to get dressed. However, robots such as legged robots are also very complex machines, which are highly prone to damage when they are not operating in the well-controlled environments of factories. Moreover, because of this complexity and the large variety of environments they might encounter, it impossible for engineers to anticipate all the damage situations that the robot may encounter and to program its reactions accordingly.

A promising approach to overcome this difficulty is to enable robots to learn on their own how to face and how to respond to the different situations they encounter. This approach shares similarities with the way humans and animals react in analogous circumstances. For instance, a child with a sprained ankle learns on his own how to walk with only one foot in order to minimise the pain. The objective of this research project is to develop the algorithmic foundations that allow robots to do the same. In previous works, we have developed creative learning algorithms that enable (physical) legged robots to overcome the loss of a leg by learning how to walk forward in less than two minutes. However, in these works, the algorithms were configured to solve a single task (i.e., walking forward), which does not leverage the versatility of legged robots and their capability, for instance, to walk in every direction, to jump, and to crawl.

The ambition of this project is to extend the adaptation capabilities of our algorithms to the entire range of the robots' abilities. This will be achieved by employing recent advances in hierarchical reinforcement learning to transfer knowledge during the adaptation process across the different skills of the robots. The combination of these hierarchical skill repertoires with our online-adaptation algorithms will enable robots to quickly transfer the result of their adaptation on one skill to the other skills. For instance, after finding a new way to walk forward, a robot might have discovered that it cannot rely on its front-left leg. With the proposed project, this information will be automatically used by the robot to speed-up the adaptation process when it will try, for instance, to learn to turn by avoiding to use the front-left leg too. In addition to damage recovery, the same algorithm will enable robots to adapt from changes in their environment, for instance by changing their behaviours depending on whether they walk on flat concrete floor or on sloping grassy ground.

Increasing the adaptation capabilities of versatile robots aims in the long term to enable the use of robots to substitutes humans in the most dangerous task they have to perform. For instance, thanks to robots with improved adaptation abilities, it would be possible to send robots searching for survivors after an earthquake or to operate in a nuclear plant after a disaster. Improving the ability of robots to overcome unknown situations is one of the key requirements to enable them to be a significant part of our daily life.

This research will be undertaken at Imperial College London, in the department of computing. The project will benefit from state of the art robotic facilities, including a quadruped robot, a hexapod robot and a motion capture system, to develop and experiment a new generation of learning algorithms for resilient robots.
 
Description Learning Introspective Control
Amount $1,600,000 (USD)
Organisation Defense Advanced Research Projects Agency (DARPA) 
Sector Public
Country United States
Start 11/2022 
End 11/2025
 
Title QDax: Accelerated Quality-Diversity 
Description QDax is a tool to accelerate Quality-Diversity (QD) and neuro-evolution algorithms through hardware accelerators and massive parallelization. QD algorithms usually take days/weeks to run on large CPU clusters. With QDax, QD algorithms can now be run in minutes! QDax has been developed as a research framework: it is flexible and easy to extend and build on and can be used for any problem setting. My entire research group is now using this tool everyday, and many other groups around the world too. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact This software leverages hardware acceleration (like GPU) to speed up a new family of algorithms, called Quality-Diversity Algorithms, by a factor of 100x. This means that instead of waiting days to get our results, we can now achieve the same outcomes in a few minutes. In less than 1 year, the Github repository already collected 200 stars and is 
URL https://github.com/adaptive-intelligent-robotics/QDax/