Humanlike physics understanding for autonomous robots

Lead Research Organisation: University of Leeds
Department Name: Sch of Computing


How do you grasp a bottle of milk, nestling behind some yoghurt pots, within a cluttered fridge? Whilst humans are able to use visual information to plan and select such skilled actions with external objects with great ease and rapidity - a facility acquired in the history of the species and as a child develops - *robots struggle*. Indeed, whilst artificial intelligence has made great leaps in beating the best of humanity in tasks such as chess and Go, the planning and execution abilities of today's robotic technology are trumped by the average toddler. Given the complex and unpredictable world within which we find ourselves situated, these apparently trivial tasks are the product of highly sophisticated neural computations that generalise and adapt to changing situations, continually engaging in a process of selecting between multiple goals and action options. Our aim is to investigate how such computations could be transferred to robots to enable them to manipulate objects more efficiently, in a more human-like way than is presently the case, and to perform manipulation presently beyond the state of the art.

Let us return to the fridge example. You first need to decide which yoghurt pot is best to remove to allow access to the milk bottle, and then generate the appropriate movements to grasp the pot safely: the *pre-contact* phase of prehension. You then need to decide what type of forces to apply to the pot (push it to the left or the right, nudge it, or possibly lift it up and place it on another shelf, etc.), i.e. the *contact* phase. Whilst these steps happen with speed and automaticity in real time, we will probe these processes in controlled laboratory situations to systematically examine the pre-contact and contact phases of prehension, and to determine what factors (spatial position, size of pot, texture of pot, etc.) bias humans to choose one action (or series of actions) over other possibilities. We hypothesise that we can extract a set of high-level rules, expressed using qualitative spatio-temporal formalisms, which can capture the essence of such expertise in combination with more quantitative lower-level representations and reasoning.

We will develop a computational model to provide a formal foundation for testing hypotheses about the factors biasing behaviour and ultimately use this model to predict the behaviour that will most probably occur in response to a given perceptual (visual) input in this context. We reason that a computational understanding of how humans perform these actions can bridge the robot-human skill gap.

State-of-the-art robot motion/manipulation planners use probabilistic methods (random sampling, e.g. RRTs and PRMs, is the dominant motion planning approach in the field today). Hence, planners are not able to explain their decisions, much like the "black box" machine learning methods mentioned in the call, which produce inscrutable models. However, if robots can generate human-like interactions with the world, and if they can use knowledge of human action selection for planning, then this would allow robots to explain why they perform manipulations in a particular way, and would also facilitate "legible manipulation", i.e. action which is predictable by humans since it closely corresponds to how humans would behave, a goal of some recent research in the robotics community.

The work will shed light on the use of perceptual information in the control of action, a topic of great academic interest, and will simultaneously have direct relevance to a number of practical problems facing roboticists seeking to control robots working in cluttered environments: from a robot picking items in a warehouse, to novel surgical technologies requiring discrimination between healthy and cancerous tissue.

Planned Impact

This project has the potential to lead to major advances in situations where human skills exceed modern robot capabilities and thus will impact on a number of user groups.

Societal Impact:

1. Human-modelled robotics that are capable of grasping and manipulating objects will be crucial for future service robots deployed to help people in their daily lives (from being in our homes and supporting tasks such as fetching the remote through to hospitals and supporting the care needs of immobile patients).

2. To move robots beyond repetitive factory-line tasks and specialist labour settings into homes to benefit end-users (more variable, uncertain and unstructured environments, where goal locations are not pre-defined), a model that learns and generalises (like a human) is necessary for robustness. Our work will provide a proof of concept that can be extrapolated to applications that involve novel and dynamic contexts (and this would be an aim for future work).

3. Explaining the black box: The general public will benefit. If robots have human-inspired actions and more transparent reasoning for their decision-making, their interactions with humans will increase the acceptability of these devices and confidence in their capabilities. Similarly, modelling natural human interaction can improve the design of requirements for human-robot interaction (HRI). Successful co-operation in HRI is a fundamental challenge; this framework would help improve spatial and temporal co-ordination of activities.

Impact on knowledge base:

1. Artificial intelligence researchers and roboticists, through the demonstration of human-inspired control schemes enabling skilful robotic interactions with the environment; and robotic control designers, through a more precise specification of how robotic systems can interact with the underlying visual-motor action selection of the user.

2. If robots share behavioural characteristics of humans, these systems can be used to provide insights into the frailties of human decision-making and the aetiology of these biases. They can present an alternative to animal models, and can inform how we understand motor behaviour in individuals with neurological conditions in ways that would be impossible or unethical to study in humans.

3. Facilitating the cross-pollination of ideas: Research staff engaged in the project will benefit through exposure to methodologies that range across artificial intelligence, motion planning, reinforcement learning, decision-making, cognitive science and engineering solutions. The project will promote working in tandem to develop mutually beneficial advances in our understanding of human perceptual-motor behaviour, e.g. through computational modelling of action selection to progress the sophistication of robotic technology.

Economic impact:

Increasing the productivity of businesses. Here, we will focus our application on picking robots in warehouses, with a particular focus on e-commerce. One of our test cases will involve competing in the Amazon Picking Challenge. Improvements in these systems will yield tangible benefits in efficiency and cost savings to businesses, allowing them to process and deliver orders faster. Natural extensions of this work are to other situations where skilled planning and motor actions are required, e.g. autonomous vehicles and search and rescue robots.

What will we do to ensure that benefits are realised?
1) Our publication strategy will focus on targeting high-impact engineering and neuroscience outlets (e.g. IJRR, Autonomous Systems, the Journal of Neuroscience as well as conferences such as ICRA)

2) Conference presentations to robotics, computer science & psychology research audiences

3) Liaison with industrial partners on the potential for knowledge transfer

4) Dissemination with policy bodies guided by Nexus and our industrial partners.


Description A human-like planner has been created to allow a robot to improve its performance in manipulating objects on a cluttered surface. This was achieved by learning from a dataset of humans performing this task in a virtual environment. The new planner is not only faster than a state-of-the-art stochastic planner, but has also been shown to build plans which are similar to those which a human would construct.

For similar reaching-through-clutter tasks we have also investigated "human-in-the-loop" approaches and reinforcement learning approaches. Using human-in-the-loop, a robot planner tries to solve a given problem by itself, but if the problem proves to be difficult, it requests high-level human input. The high-level human input is quick and easy for the human to provide: the human simply clicks on an obstacle object and clicks on a suggested new pose. We showed that with minimal human time, there can be drastic increases in the robotic planning success rates. Using the reinforcement learning approach, we have combined the recent advances in deep policy learning with forward chaining planning methods. Our results show the trade-off between model-based planning methods and data-driven learning methods for the problem of physics-based manipulation.
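
The human-in-the-loop idea described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the names `HumanHint`, `plan_with_human_fallback`, `toy_planner` and `toy_human`, the `max_requests` limit, and the representation of a plan as a list of strings are all assumptions made for illustration. The sketch only shows the control flow: plan autonomously first, and ask for an (obstacle, new pose) suggestion only when planning fails.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class HumanHint:
    """One high-level suggestion: an obstacle and a new pose for it (hypothetical type)."""
    obstacle: str
    new_pose: Tuple[float, float]


Plan = List[str]


def plan_with_human_fallback(
    try_plan: Callable[[Sequence[HumanHint]], Optional[Plan]],
    ask_human: Callable[[], HumanHint],
    max_requests: int = 3,  # assumed budget on human input, not from the source
) -> Tuple[Optional[Plan], int]:
    """Plan autonomously; request one more human hint each time planning fails.

    Returns the plan (or None if no plan was found) and the number of hints used.
    """
    hints: List[HumanHint] = []
    while True:
        plan = try_plan(hints)
        if plan is not None or len(hints) >= max_requests:
            return plan, len(hints)
        hints.append(ask_human())


def toy_planner(hints: Sequence[HumanHint]) -> Optional[Plan]:
    """Stand-in planner: fails until told to move the blocking yoghurt pot."""
    if not any(h.obstacle == "yoghurt_pot" for h in hints):
        return None
    moves = [f"push {h.obstacle} to {h.new_pose}" for h in hints]
    return moves + ["reach milk_bottle"]


def toy_human() -> HumanHint:
    """Stand-in for the two clicks: select an obstacle, then a new pose."""
    return HumanHint(obstacle="yoghurt_pot", new_pose=(0.30, 0.10))


if __name__ == "__main__":
    plan, hints_used = plan_with_human_fallback(toy_planner, toy_human)
    print(plan, hints_used)
```

In this toy setting the planner fails on its first attempt, receives a single hint, and then succeeds, which mirrors the claim that a small amount of human time can unlock otherwise difficult problems.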

Another avenue of research has been how humans learn from their mistakes, with the aim of endowing robots with an ability to learn from their mistakes; for this to happen, a robot must resolve whether decisions that fail to produce rewards are due to poorly selected action plans or badly executed movements. Humans are remarkably adept at solving such credit assignment problems, but the specific neural processes involved in this error processing have, to date, been unclear. In newly published work we show that neural activity associated with reinforcement learning, a medial frontal negative deflection in scalp-recorded EEG activity, is able to discriminate between these errors and provides novel insight into how the brain responds to different classes of error that determine future action. In our follow-up grant, we will be examining the utility of using this neural signature to help robots "learn like humans".
Exploitation Route We have already met with our industrial collaborators so that they can consider whether the results would help them in their work. We are presenting the work at ICRA 2020, which is one of the major robotics conferences. The code and dataset are available as a GitHub repository.
Sectors Digital/Communication/Information Technologies (including Software), Healthcare, Manufacturing, including Industrial Biotechnology, Retail

Description ESRC 1+3 White Rose Doctoral Studentship in Artificial Intelligence
Amount £78,036 (GBP)
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 10/2019 
End 09/2022

Title Code base for Human Like Planner published in ICRA 2020
Description Code for the experiments conducted with humans in a VR setting manipulating objects in a cluttered tabletop environment.
Type Of Material Computer model/algorithm 
Year Produced 2020 
Provided To Others? Yes  
Impact ICRA 2020 paper accepted. 

Title Dataset for Human Like Planner published in ICRA 2020
Description Dataset containing the experiments conducted with humans in a VR setting manipulating objects in a cluttered tabletop environment.
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact ICRA 2020 paper accepted.