Humanlike physics understanding for autonomous robots

Lead Research Organisation: University of Leeds
Department Name: Sch of Computing


How do you grasp a bottle of milk, nestling behind some yoghurt pots, within a cluttered fridge? Whilst humans are able to use visual information to plan and select such skilled actions with external objects with great ease and rapidity - a facility acquired over the history of the species and during child development - *robots struggle*. Indeed, whilst artificial intelligence has made great leaps in beating the best of humanity in tasks such as chess and Go, the planning and execution abilities of today's robotic technology are trumped by the average toddler. Given the complex and unpredictable world within which we find ourselves situated, these apparently trivial tasks are the product of highly sophisticated neural computations that generalise and adapt to changing situations: continually engaging in a process of selecting between multiple goals and action options. Our aim is to investigate how such computations could be transferred to robots, enabling them to manipulate objects more efficiently and in a more human-like way than is presently the case, and to perform manipulation presently beyond the state of the art.

Let us return to the fridge example: you need first to decide which yoghurt pot is best to remove to allow access to the milk bottle, and then generate the appropriate movements to grasp the pot safely - the *pre-contact* phase of prehension. You then need to decide what type of forces to apply to the pot (push it to the left or the right, nudge it, or possibly lift it up and place it on another shelf) - the *contact* phase. Whilst these steps happen with speed and automaticity in real time, we will probe these processes in laboratory-controlled situations, systematically examining the pre-contact and contact phases of prehension to determine what factors (spatial position, size of pot, texture of pot, etc.) bias humans to choose one action (or series of actions) over other possibilities. We hypothesise that we can extract a set of high-level rules, expressed using qualitative spatio-temporal formalisms, which capture the essence of such expertise, in combination with more quantitative lower-level representations and reasoning.
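To illustrate the kind of qualitative spatial abstraction we have in mind, here is a minimal sketch: continuous object positions are mapped to a small vocabulary of relations over which high-level decision rules can operate. The function name and the coarse four-way carving of space are illustrative assumptions, not the formalism the project will develop.

```python
import math

def qualitative_relation(target_xy, obstacle_xy):
    """Classify an obstacle's position relative to a target object into
    one of four coarse qualitative regions (a hypothetical, minimal
    stand-in for a richer qualitative spatial calculus)."""
    dx = obstacle_xy[0] - target_xy[0]
    dy = obstacle_xy[1] - target_xy[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360
    if 45 <= angle < 135:
        return "behind"
    if 135 <= angle < 225:
        return "left-of"
    if 225 <= angle < 315:
        return "in-front-of"
    return "right-of"
```

A rule such as "move the pot that is left-of the bottle aside before reaching" can then be stated over these symbols regardless of exact object coordinates.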

We will develop a computational model to provide a formal foundation for testing hypotheses about the factors biasing behaviour and ultimately use this model to predict the behaviour that will most probably occur in response to a given perceptual (visual) input in this context. We reason that a computational understanding of how humans perform these actions can bridge the robot-human skill gap.

State-of-the-art robot motion/manipulation planners use probabilistic methods (random sampling, e.g. RRTs and PRMs, is the dominant motion planning approach in the field today). Consequently, such planners are not able to explain their decisions, much like the "black box" machine learning methods mentioned in the call, which produce inscrutable models. However, if robots can generate human-like interactions with the world, and if they can use knowledge of human action selection for planning, this would allow robots to explain why they perform manipulations in a particular way. It would also facilitate "legible manipulation" - action that is predictable by humans because it closely corresponds to how humans would behave - a goal of some recent research in the robotics community.
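For context, the sampling-based planning idea referred to above (an RRT) can be sketched in a few lines. This is a minimal 2D illustration of the technique, not the planners used in the project; the workspace bounds and goal-bias probability are arbitrary assumptions.

```python
import math
import random

def rrt(start, goal, is_free, step=0.5, iters=2000, goal_tol=0.5, seed=0):
    """Minimal 2D Rapidly-exploring Random Tree: repeatedly sample a
    point, steer the nearest tree node towards it, and stop when the
    tree gets within goal_tol of the goal."""
    rng = random.Random(seed)
    parent = {start: None}          # tree stored as child -> parent
    for _ in range(iters):
        # With 10% probability, sample the goal itself (goal bias).
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10),
                                                  rng.uniform(0, 10))
        near = min(parent, key=lambda n: math.dist(n, sample))
        d = math.dist(near, sample)
        if d == 0:
            continue
        t = min(1.0, step / d)      # steer at most `step` towards sample
        new = (near[0] + t * (sample[0] - near[0]),
               near[1] + t * (sample[1] - near[1]))
        if not is_free(new):        # collision check supplied by caller
            continue
        parent[new] = near
        if math.dist(new, goal) < goal_tol:
            path, n = [], new       # walk back up the tree to the start
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
    return None  # no path found within the iteration budget
```

Note that nothing in the returned path records *why* one route was chosen over another - which is precisely the explainability gap discussed above.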

The work will shed light on the use of perceptual information in the control of action - a topic of great academic interest - and will simultaneously have direct relevance to a number of practical problems facing roboticists seeking to control robots in cluttered environments: from a robot picking items in a warehouse, to novel surgical technologies requiring discrimination between healthy and cancerous tissue.

Planned Impact

This project has the potential to lead to major advances in situations where human skills exceed modern robot capabilities and thus will impact on a number of user groups.

Societal Impact:

1. Human-modelled robots capable of grasping and manipulating objects will be crucial for future service robots deployed to help people in their daily lives (from supporting tasks in our homes, such as fetching the remote control, through to supporting the care needs of immobile patients in hospitals).

2. To move robots beyond repetitive factory-line tasks and specialist labour settings into homes that benefit end users - more variable, uncertain and unstructured environments where goal locations are not pre-defined - a model that learns and generalises (like a human) is necessary for robustness. Our work will provide a proof of concept that can be extrapolated to applications involving novel and dynamic contexts (an aim for future work).

3. Explaining the black box: the general public will benefit. If robots act in human-inspired ways and offer more transparent reasoning for their decision-making, interactions with humans will increase the acceptability of these devices and confidence in their capabilities. Similarly, modelling natural human interaction can improve the design requirements for human-robot interaction (HRI). Successful co-operation in HRI is a fundamental challenge; this framework would help improve the spatial and temporal co-ordination of activities.

Impact on knowledge base:

1. Artificial intelligence researchers and roboticists will benefit through the demonstration of human-inspired control schemes enabling skilful robotic interactions with the environment, and robotic control designers through a more precise specification of how robotic systems can interact with the underlying visuo-motor action selection of the user.

2. If robots share behavioural characteristics with humans, these systems can provide insights into the frailties of human decision-making and the aetiology of these biases. They can present an alternative to animal models and inform how we understand motor behaviour in individuals with neurological conditions, in ways that would be impossible or unethical to study in humans.

3. Facilitating the cross-pollination of ideas: research staff engaged in the project will benefit through exposure to methodologies ranging across artificial intelligence, motion planning, reinforcement learning, decision-making, cognitive science and engineering. The project will promote working in tandem towards mutually beneficial advances in our understanding of human perceptual-motor behaviour, e.g. through computational modelling of action selection to advance the sophistication of robotic technology.

Economic impact:

Increasing the productivity of businesses: here, we will focus our application on picking robots in warehouses, with a particular focus on e-commerce. One of our test cases will involve competing in the Amazon Picking Challenge; improvements in these systems will yield tangible efficiency gains and cost savings for businesses, allowing them to process and deliver orders faster. Natural extensions of this work are to other situations requiring skilled planning and motor actions, e.g. autonomous vehicles and search-and-rescue robots.

How will we ensure that benefits are realised?
1) Our publication strategy will focus on targeting high-impact engineering and neuroscience outlets (e.g. IJRR, Autonomous Systems, the Journal of Neuroscience, as well as conferences such as ICRA)

2) Conference presentations to robotics, computer science & psychology research audiences

3) Liaison with industrial partners on the potential for knowledge transfer

4) Dissemination with policy bodies guided by Nexus and our industrial partners.



Publications

- Wang H (2021) Spatio-Temporal Manifold Learning for Human Motions via Long-Horizon Modeling. IEEE Transactions on Visualization and Computer Graphics
- Osnes C (2021) Investigating the construct validity of a haptic virtual caries simulation for dental education. BMJ Simulation & Technology Enhanced Learning
- Narvekar S (2020) Curriculum learning for reinforcement learning domains: a framework and survey. Journal of Machine Learning Research
- Hua H (2022) Towards Explainable Action Recognition by Salient Qualitative Spatial Object Relation Chains. Proceedings of the 36th AAAI Conference on Artificial Intelligence
- Gutierrez RL (2020) Information-theoretic task selection for meta-reinforcement learning. Advances in Neural Information Processing Systems
- Chen W (2020) Dynamic Future Net
- Bejjani W (2021) Learning image-based Receding Horizon Planning for manipulation in clutter. Robotics and Autonomous Systems
- Baniqued PDE (2021) Brain-computer interface robotics for hand rehabilitation after stroke: a systematic review. Journal of NeuroEngineering and Rehabilitation
- Balkhoyor AM (2020) Frontal theta brain activity varies as a function of surgical experience and task error. BMJ Surgery, Interventions, & Health Technologies
- Al-Saud LM (2020) Early assessment with a virtual reality haptic simulator predicts performance in clinical practice. BMJ Simulation & Technology Enhanced Learning
- Akhtiamov D (2021) Spatial representability of neuronal activity. Scientific Reports

Description A human-like planner has been created that allows a robot to improve its performance when manipulating objects on a cluttered surface. This was achieved by learning from a dataset of humans performing the task in a virtual environment. The new planner is not only faster than a state-of-the-art stochastic planner, but has also been shown to build plans similar to those a human would construct.

For similar reaching through clutter tasks we have also investigated "human-in-the-loop" approaches and reinforcement learning approaches. Using human-in-the-loop, a robot planner tries to solve a given problem by itself, but if the problem proves to be difficult, it requests high-level human input. The high level human input is quick and easy for the human to provide: the human simply clicks on an obstacle object and clicks on a suggested new pose. We showed that with minimal human time, there can be drastic increases in the robotic planning success rates. Using the reinforcement learning approach, we have combined the recent advances in deep policy learning with forward chaining planning methods. Our results show the trade-off between model-based planning methods and data-driven learning methods for the problem of physics-based manipulation.
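The human-in-the-loop control flow described above can be sketched as follows. The problem representation (a mapping from object names to poses) and the `planner` and `ask_human` interfaces are hypothetical stand-ins, not the project's actual implementation.

```python
def plan_with_human_in_the_loop(problem, planner, ask_human, max_attempts=3):
    """Try to solve the problem autonomously; on failure, request one
    high-level human hint (which obstacle to move, and a suggested new
    pose), apply it, and retry."""
    for _ in range(max_attempts):
        plan = planner(problem)
        if plan is not None:        # planner solved it by itself
            return plan
        # Cheap human input: one click on an obstacle, one on a pose.
        obstacle, new_pose = ask_human(problem)
        problem = dict(problem, **{obstacle: new_pose})
    return None
```

The key design point is that the human supplies only a high-level suggestion; all low-level motion planning remains the robot's job, so the human time cost stays minimal.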

Another avenue of research has been how humans learn from their mistakes, with the aim of endowing robots with the same ability. For this to happen, a robot must resolve whether decisions that fail to produce rewards are due to poorly selected action plans or badly executed movements. Humans are remarkably adept at solving such credit assignment problems, but the specific neural processes involved have, to date, been unclear. In newly published work we show that neural activity associated with reinforcement learning - a medial frontal negative deflection in scalp-recorded EEG activity - can discriminate between these types of error, providing novel insight into how the brain responds to the different classes of error that determine future action. In our follow-up grant, we will examine the utility of this neural signature in helping robots "learn like humans".
Exploitation Route We have already met with our industrial collaborators so that they can consider whether the results would help them in their work. We are presenting the work at ICRA 2020, one of the major robotics conferences. The code and dataset are available as a GitHub repository.
Sectors Digital/Communication/Information Technologies (including Software); Industrial Biotechnology


Description Findings of this project led to discussions with companies in the industry (including Amazon Robotics, Cavendish Nuclear, Advanced Supply Chain Group, Zebra Technologies, Bosch Research and Asda Logistics). These discussions led to newly proposed projects on using robots in manufacturing and warehouses. One such project was funded, with support from Amazon Robotics and Advanced Supply Chain Group. This project is currently active, and its goal is to make the UK competitive in robotic automation of warehouse picking and packing. This is an area of increasing activity in many leading countries, and the outputs of this project will help the UK stay at the cutting edge of technology in automated warehouse picking and packing. We also recently engaged in a meeting with Microsoft Research, where we presented findings related to this project and discussed the possibility of using robotic manipulation technologies in data centre maintenance.
First Year Of Impact 2022
Sector Retail
Impact Types Economic

Description ESRC 1+3 White Rose Doctoral Studentship in Artificial Intelligence
Amount £78,036 (GBP)
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 09/2019 
End 09/2022
Description Robotic picking and packing with physical reasoning
Amount £1,196,800 (GBP)
Funding ID EP/V052659/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 12/2021 
End 11/2026
Description The Equitable, Inclusive, and Human-Centered XR Project
Amount £150,000 (GBP)
Funding ID 10039307 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 11/2022 
End 10/2025
Title Code base for Human Like Planner published in ICRA 2020 
Description Code for the experiments conducted with humans in a VR setting manipulating objects in a cluttered tabletop environment. 
Type Of Material Computer model/algorithm 
Year Produced 2020 
Provided To Others? Yes  
Impact ICRA 2020 paper accepted. 
Title Dataset for Human Like Planner published in ICRA 2020 
Description Dataset containing the experiments conducted with humans in a VR setting manipulating objects in a cluttered tabletop environment. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact ICRA 2020 paper accepted. 
Title Learning manipulation planning from VR human demonstrations 
Description The objective of this project is to learn high-level manipulation planning skills from humans and transfer them to robot planners. We used virtual reality to generate data from human participants whilst they reached for objects on a cluttered tabletop. From this, we devised a qualitative representation of the task space to abstract human decisions, irrespective of the number of objects in the way. Based on this representation, human demonstrations were segmented and used to train decision classifiers. Using these classifiers, our planner produces a list of waypoints in the task space. These waypoints provide a high-level plan, which can be transferred to any arbitrary robot model. The VR dataset has been publicly released. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Description Multi-object manipulation (with UC Berkeley) 
Organisation University of California, Berkeley
Country United States 
Sector Academic/University 
PI Contribution This is a collaboration between our group at the University of Leeds and Prof. Ken Goldberg's group at UC Berkeley, USA, who are also partners of my EPSRC Fellowship. We have contributed to the development of methods for the manipulation and grasping of multiple rigid and deformable objects by robots. We developed an initial prototype of the multi-object grasping system in Leeds. The system was further developed at UC Berkeley, together with a visitor from our team, and was recently generalised to deformable objects (e.g., garments/clothes). We also helped with the writing of publications.
Collaborator Contribution The UC Berkeley team developed the core system as well as the algorithmic methods, in collaboration with our team. They performed most of the experimental work in their labs. They also led the writing of the publications.
Impact The collaboration resulted in four publications so far.
Start Year 2021
Description Physics-based model-predictive control for object manipulation (with ETH Zurich) 
Organisation ETH Zurich
Country Switzerland 
Sector Academic/University 
PI Contribution I visited Prof. Robert Katzschmann's group at ETH Zurich, who are also partners on my EPSRC Fellowship. During the visit, I learned about the tendon-driven, anthropomorphic robotic hand developed by Prof. Katzschmann's group, as well as the different learning/AI methods they are exploring to perform autonomous manipulation with this hand. I contributed input to the development of such methods, particularly leading a method based on physics-based model-predictive control. This work is still ongoing.
Collaborator Contribution Prof. Katzschmann's group contributed by providing the robotic hand hardware, as well as all other necessary tools for the development.
Impact This collaboration does not have any outputs yet. It is ongoing work.
Start Year 2023
Description Use of tactile sensors in physics-based manipulation (with Univ. of Bristol) 
Organisation University of Bristol
Country United Kingdom 
Sector Academic/University 
PI Contribution Integrating tactile sensors developed by Prof. Nathan Lepora's group at University of Bristol (who are partners on my EPSRC Fellowship) into our physics-based object perception algorithms and methods.
Collaborator Contribution Prof. Nathan Lepora's group contributed by teaching us how to manufacture one of their TacTip sensors, including on-site training of the PDRA, and also helped with the software that accompanies this sensor.
Impact Ongoing collaboration. No outputs yet.
Start Year 2023
Description Visit by ASDA Logistics Services 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Industry/Business
Results and Impact Asda Logistics Services visited our labs to discuss possible applications of our object manipulation methods in Asda warehouses. This was then followed by a visit from our researchers to Asda warehouses.
Year(s) Of Engagement Activity 2023