Perceiving, Modelling and Interacting with the Object-Based World

Lead Research Organisation: Imperial College London
Department Name: Computing

Abstract

"Perceiving, Modelling and Interacting Autonomously in a Dynamic Object-Based World"

The Dyson Robotics Lab at Imperial College was founded in 2014 as a collaboration between Dyson Technology Ltd and Imperial College. It is the culmination of a thirteen-year partnership between Professor Andrew Davison and Dyson to bring his Simultaneous Localisation and Mapping (SLAM) algorithms out of the laboratory and into commercial robots, resulting in Dyson's 360 Eye vision-based vacuum cleaning robot in 2015, which can map its surroundings, localise itself and plan a systematic cleaning pattern. Our success in working together made it clear that computer vision is a key enabling technology for future robots. This proposal aims to fund the Lab to push the forefront of visual scene understanding and vision-enabled robotic manipulation into new and more demanding application areas.

The research activity we are outlining in this Prosperity Partnership complements the large internal R&D investment that Dyson is making to create advanced robotic products. The aims of this partnership are to invent and prototype the breakthrough robot vision algorithms which could truly take us to next-generation capability for advanced robotics working in unstructured environments, and to transfer this technology into the long-term product pipeline of Dyson as they aim to open up new product categories.

Dyson has now been working on robotics for nearly 20 years, a period during which the emergence of real consumer robotic products has happened alongside astounding progress in academic research in the broad field of AI. At the present time, floor cleaners are still the only category of mass-market robot which has achieved significant commercial success. This can be put down simply to the greater difficulty of the other, more complex tasks and chores that a consumer might want an autonomous product to achieve. These tasks place much larger demands on a robotic system to understand and interact with its complicated 3D surroundings and the objects they contain. This programme will focus on creating the research breakthroughs needed to enable this next-generation capability.

There are scene perception and modelling competences which underlie all of these use cases, and these will be our research focus as we develop the algorithms behind next-generation object-based SLAM systems by combining all of our knowledge in state estimation and machine learning. We will also work more specifically on the methods for training learning systems; methods for advanced vision-guided manipulation; and the frameworks needed for practical, contextual human-robot interaction. The core scientific work will be forward-looking and academic, but always with strong guidance from our partners at Dyson.

Planned Impact

Domestic robotics has been forecast to be a major global growth sector (£1B currently to £20B by 2025), and Dyson is well-positioned to be a key driver of this growth with its investments into personnel, research and facilities over the past five years. It is often stated that the UK should aim at leadership in AI, but Dyson is one of the very few UK companies aiming seriously at making that happen at scale with real robot products already on the market and sold worldwide.
Robotic vacuum cleaners and lawnmowers have barely scratched the surface of the possibilities that domestic robotics represents, and the research programme presented in this proposal tackles key fundamental challenges that need to be addressed in order to produce robots that can perform useful functions in the real world alongside humans. Furthermore, specific provisions have been made to ensure a continuous pipeline of technologies from low-TRL research at the Laboratory all the way through to high-TRL commercial deployment by Dyson; as such, Prosperity Partnership funding will provide a much better flow of technology into industry.

Besides direct wealth creation, the existence of a centre of knowledge and excellence in the field of vision-enabled robotic manipulation will act as a nexus to draw (and retain) much-needed skills, capabilities and investment into the UK. The Laboratory already has a good track record of attracting some of the top research talent in the world, and Prosperity Partnership funding will greatly aid in maintaining this attractiveness. The close involvement of Dyson engineers with the Laboratory will ground researchers in real-world challenges and give them a flavour of life in industry should they be interested in pursuing non-academic careers - Dyson will certainly require many more robotics engineers.
SLAM and its evolution into general robotic spatial awareness remain key areas of interest in the academic disciplines of robotics, computer vision and AI, and we intend to make fundamental and high impact published scientific contributions during the project. Our track record in consistently encouraging and helping students and PDRAs to publish at this level speaks for itself. Doubtless, the research outputs of the Laboratory will also find application in a whole host of different sectors, for example medical robotics, construction, disaster relief, assisted living and manufacturing, all of which require robots that can interact in real-time with complex, dynamic environments.

From a societal point of view, the advent of domestic robots capable of performing a whole range of tasks around the home can be expected to improve quality of life by reducing the amount of time devoted to household chores. This will have the largest impact on homemakers and, given that the majority of homemakers are still women, the widespread adoption of domestic robotics might also be expected to have beneficial knock-on effects on gender equality and household wage demographics. A further corollary to the beneficial effects of domestic robotics is the great potential to extend the independence of an ageing population.

Finally, Dyson and Imperial College have a strong relationship with technical media outlets including the BBC, The Times, The Guardian and Wired Magazine, which will allow us to disseminate our results to a large, worldwide audience. We will also ensure that the key technical demonstrators are showcased at multiple events; the Imperial Festival, for example, attracts over 5,000 people and generates considerable media interest. Through these engagements with the public, we aim to inspire the next generation of engineers and scientists and to get people excited about the potential of robotics. Lastly, the Lab maintains an open website, and this project will have its own space which will be continually updated with the latest developments.

Publications


James S (2022) Q-Attention: Enabling Efficient Learning for Vision-Based Robotic Manipulation in IEEE Robotics and Automation Letters

Matsuki H (2021) CodeMapping: Real-Time Dense Mapping for Sparse SLAM using Compact Scene Representations in IEEE Robotics and Automation Letters

Patwardhan A (2023) Distributing Collaborative Multi-Robot Planning With Gaussian Belief Propagation in IEEE Robotics and Automation Letters

 
Description In scene understanding, we are focusing on compact representations that will enable efficient, real-time SLAM applications. We have made progress in this area with an algorithm that accurately represents the environment by focusing the learned networks on scene completion (CodeMapping). Moreover, we have shown that a small number of in-place annotations can also be used to achieve 3D semantic labelling (Semantic-NeRF).

Humans can infer the shape of occluded objects with ease based on their prior knowledge of the geometric properties of those objects. We have taken a similar approach to estimate the shapes of objects in a pile from just a single view (SIMstack), and we also created a system (SafePicking) that integrates object-level mapping and learning-based motion planning to generate motions that safely extract occluded target objects from a pile.

We have also made good progress in object-based scene modelling. Using knowledge of the 3D geometry of particular objects, we developed an algorithm (NodeSLAM) that can build an accurate representation of the environment. We have also aimed to build such representations without the use of prior data, based on a scene-specific implicit 3D model of occupancy and colour (iMAP). Finally, we developed an interactive 3D scene understanding system: a user annotates semantic properties via clicks while scanning and mapping a scene with a handheld RGB-D sensor, and the scene model is updated and visualised in real time, allowing ultra-efficient labelling (iLabel). The work has also been extended to incorporate the physical properties of objects through fully autonomous experimental interactions (Haughton).
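As a rough illustration of what a scene-specific implicit model of occupancy/density and colour looks like, the following is a minimal sketch in the spirit of iMAP, not the published implementation; the network size, positional encoding and rendering details here are our own illustrative assumptions.

import torch
import torch.nn as nn

class ImplicitSceneModel(nn.Module):
    """Tiny MLP mapping a 3D point to (volume density, RGB colour).

    Illustrative sketch only: layer sizes and the sinusoidal positional
    encoding are assumptions, not the values used in iMAP.
    """

    def __init__(self, n_freqs: int = 6, hidden: int = 256):
        super().__init__()
        self.n_freqs = n_freqs
        in_dim = 3 + 3 * 2 * n_freqs          # xyz plus sin/cos features
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),              # (density, r, g, b)
        )

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for k in range(self.n_freqs):
            feats += [torch.sin((2.0 ** k) * x), torch.cos((2.0 ** k) * x)]
        return torch.cat(feats, dim=-1)

    def forward(self, x: torch.Tensor):
        out = self.mlp(self.encode(x))
        density = torch.relu(out[..., 0])      # non-negative volume density
        colour = torch.sigmoid(out[..., 1:])   # RGB in [0, 1]
        return density, colour


def render_ray(model, origin, direction, near=0.1, far=5.0, n_samples=64):
    """Very simple volume rendering of one ray: sample points, composite colour."""
    t = torch.linspace(near, far, n_samples)
    points = origin + t[:, None] * direction           # (n_samples, 3)
    density, colour = model(points)
    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-density * delta)           # per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    weights = trans * alpha
    rgb = (weights[:, None] * colour).sum(dim=0)        # composited pixel colour
    depth = (weights * t).sum(dim=0)                    # expected depth along the ray
    return rgb, depth

In a live mapping system of this kind, the network weights (and camera poses) are optimised continually against the incoming RGB-D stream, using photometric and depth losses on batches of rendered rays.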
We have also looked into vision-based manipulation systems: a system that uses a small number of demonstrations to perform simple tasks (Q-Attention) and a system that rearranges a pile of objects by generating collision-free motion plans (ReorientBot). Finally, we looked at generating an image representing a human-like arrangement of some given objects based on inferring a text description of those objects (DALL-E-Bot).

Finally, we expect that a robot should in future be able to operate in the same area as other autonomous robotic devices, and thus precise, coordinated multi-robot motion planning will be required. Our first work in this area has recently been published (Patwardhan).
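To give a flavour of the formulation used in that work (a simplified sketch in our own notation, not the paper's exact model): each robot i plans a short horizon of states x_i^1, ..., x_i^T, and the joint posterior over all robots' trajectories is written as a product of Gaussian factors,

p(\mathbf{X}) \;\propto\; \prod_{i}\prod_{t} f_{\mathrm{dyn}}\big(x_i^{t}, x_i^{t+1}\big) \;\prod_{i \neq j}\prod_{t} f_{\mathrm{col}}\big(x_i^{t}, x_j^{t}\big),

with dynamics/smoothness factors along each robot's own trajectory and pairwise collision-avoidance factors between robots that come close to each other. Gaussian Belief Propagation estimates the state marginals by iterating purely local message passing on this factor graph, which is what allows the planning computation to be distributed across robots that communicate only with nearby neighbours.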
Exploitation Route We have published this work at international venues and have released open-source software and datasets for the projects. This work is being taken forward within our research group, within Dyson, and by other research teams around the world.
Sectors Digital/Communication/Information Technologies (including Software)

URL https://www.imperial.ac.uk/dyson-robotics-lab/publications/
 
Title RLBench: The Robot Learning Benchmark & Learning Environment 
Description We present a challenging new benchmark and learning environment for robot learning: RLBench. The benchmark features 100 completely unique, hand-designed tasks ranging in difficulty, from simple target reaching and door opening, to longer multi-stage tasks, such as opening an oven and placing a tray in it. We provide an array of both proprioceptive and visual observations, which include RGB, depth, and segmentation masks from an over-the-shoulder stereo camera and an eye-in-hand monocular camera. Uniquely, each task comes with an infinite supply of demonstrations through the use of motion planners operating on a series of waypoints given at task creation time, enabling an exciting flurry of demonstration-based learning. RLBench has been designed with scalability in mind; new tasks, along with their motion-planned demos, can be easily created and then verified by a series of tools, allowing users to submit their own tasks to the RLBench task repository. This large-scale benchmark aims to accelerate progress in a number of vision-guided manipulation research areas, including: reinforcement learning, imitation learning, multi-task learning, geometric computer vision, and in particular, few-shot learning. With the benchmark's breadth of tasks and demonstrations, we propose the first large-scale few-shot challenge in robotics. We hope that the scale and diversity of RLBench offers unparalleled research opportunities in the robot learning community and beyond.
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact It has a growing research community forming around it (>400 stars on GitHub) and is being used to benchmark state-of-the-art methods that are submitted to conferences.
URL http://www.imperial.ac.uk/dyson-robotics-lab/projects/rlbench/
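A minimal usage sketch based on the public RLBench GitHub repository follows. The exact module paths and action-mode classes have changed across RLBench releases, and the 8-dimensional action (7 joint velocities plus a gripper command for the default Franka Panda arm) is an assumption, so treat this as indicative rather than definitive.

import numpy as np

from rlbench.environment import Environment
from rlbench.observation_config import ObservationConfig
from rlbench.action_modes.action_mode import MoveArmThenGripper
from rlbench.action_modes.arm_action_modes import JointVelocity
from rlbench.action_modes.gripper_action_modes import Discrete
from rlbench.tasks import ReachTarget

# Request the full observation set: RGB, depth and segmentation masks
# from the external and wrist cameras, plus proprioception.
obs_config = ObservationConfig()
obs_config.set_all(True)

env = Environment(
    action_mode=MoveArmThenGripper(
        arm_action_mode=JointVelocity(),
        gripper_action_mode=Discrete()),
    obs_config=obs_config,
    headless=True)
env.launch()

task = env.get_task(ReachTarget)

# Motion-planned demonstrations, generated on demand ("live").
demos = task.get_demos(2, live_demos=True)

descriptions, obs = task.reset()
for _ in range(100):
    arm = np.random.normal(scale=0.1, size=7)   # joint velocities (assumed 7-DoF arm)
    gripper = np.array([1.0])                    # keep the gripper open
    obs, reward, terminate = task.step(np.concatenate([arm, gripper]))
env.shutdown()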
 
Description Dyson-DRL Partnership 
Organisation Dyson
Country United Kingdom 
Sector Private 
PI Contribution The group is well known for its work in computer vision, and there is transfer of expertise in this area to Dyson.
Collaborator Contribution Dyson has a long and successful history of developing innovative products for the home. Their expertise and knowledge in this area enable us to better understand the limitations of computer vision approaches for practical purposes and allow us to direct our research towards more practical solutions.
Impact The Dyson-DRL collaboration under the Prosperity Partnership award is only a few months old and there are no tangible outcomes yet.
Start Year 2019
 
Title DeepFactors 
Description The ability to estimate rich geometry and camera motion from monocular imagery is fundamental to future interactive robotics and augmented reality applications. Different approaches have been proposed that vary in scene geometry representation (sparse landmarks, dense maps), the consistency metric used for optimising the multi-view problem, and the use of learned priors. We present a SLAM system that unifies these methods in a probabilistic framework while still maintaining real-time performance. This is achieved through the use of a learned compact depth map representation and reformulating three different types of errors: photometric, reprojection and geometric, which we make use of within standard factor graph software. We evaluate our system on trajectory estimation and depth reconstruction on real-world sequences and present various examples of estimated dense geometry. 
Type Of Technology Software 
Year Produced 2020 
Open Source License? Yes  
Impact This software and the technical paper that describes it has been influential in the important area of combining factor graph estimation with deep learning methods. For instance it was discussed as an exciting new frontier technology in Prof Frank Dellaert's prestigious `Test of Time' presentation at the conference Robotics: Science and Systems 2020. https://roboticsconference.org/2020/program/testoftimeaward/index.html 
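The core of the system described in this record is the joint optimisation of keyframe poses and compact per-keyframe depth codes against several error types in a single factor graph. As a rough, illustrative formulation (our own notation, not taken verbatim from the paper), with keyframe images I_i, poses T_i and latent codes c_i decoded to dense depth maps D_i = D(c_i; I_i):

\min_{\{T_i,\, c_i\}} \sum_{(i,j)} \Big(
  \sum_{p} \big\| I_j\big(w_{ij}(p)\big) - I_i(p) \big\|_{\rho}
  \;+\; \sum_{k} \big\| u_{j,k} - \pi\big(T_j T_i^{-1}\, \pi^{-1}(u_{i,k}, D_i)\big) \big\|_{\rho}
  \;+\; \sum_{p} \big\| D_j\big(w_{ij}(p)\big) - \big[\,T_j T_i^{-1}\, \pi^{-1}(p, D_i)\,\big]_{z} \big\|_{\rho}
\Big) \;+\; \sum_i \|c_i\|^2,

where w_{ij}(p) warps pixel p from keyframe i into keyframe j using D_i and the relative pose, \pi and \pi^{-1} are camera projection and back-projection, u_{i,k} and u_{j,k} are matched sparse features, [\,\cdot\,]_z takes the depth component, and \|\cdot\|_{\rho} denotes a robust norm. The three inner terms correspond to the photometric, reprojection and geometric factors mentioned above, and the final term is a zero-mean prior on each code.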
 
Title In-Place Scene Labelling and Understanding with Implicit Scene Representation 
Description We extend neural radiance fields (NeRF) to jointly encode semantics with appearance and geometry, so that complete and accurate 2D semantic labels can be achieved using a small amount of in-place annotations specific to the scene. The intrinsic multi-view consistency and smoothness of NeRF benefit semantics by enabling sparse labels to efficiently propagate. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact The software allows researchers to use the algorithm for further development and benchmark it against their own algorithms. 
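To make the mechanism concrete (our own notation, paraphrasing the published idea rather than quoting it): at each sample k along a camera ray r, the network predicts a semantic logit vector s_k alongside the density \sigma_k and colour, and the logits are composited with the standard NeRF volume-rendering weights:

\hat{S}(\mathbf{r}) \;=\; \sum_{k=1}^{K} T_k \big(1 - e^{-\sigma_k \delta_k}\big)\, \mathbf{s}_k,
\qquad
T_k \;=\; \exp\!\Big(-\sum_{m<k} \sigma_m \delta_m\Big),

where \delta_k is the distance between adjacent samples. A cross-entropy loss on the rendered class distribution softmax(\hat{S}(\mathbf{r})) is applied only at the sparsely annotated pixels, alongside the usual photometric loss; the multi-view consistency of the underlying radiance field then propagates these sparse labels into dense, view-consistent semantic renderings.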
 
Description Invited Speaker - "Robotics Today - A Series of Technical Talks" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact To inform the audience about the research in our lab and to initiate discussion on the advances in computer vision algorithms that will enable the next generation of smart robots and devices to truly interact with their environments.
Year(s) Of Engagement Activity 2020
URL https://roboticstoday.github.io/
 
Description Invited Speaker - BMVA 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Professor Davison was invited to present the research direction of the lab at the British Machine Vision Association (BMVA) in 2022.
Year(s) Of Engagement Activity 2022
URL https://britishmachinevisionassociation.github.io/
 
Description Invited Speaker - BMVC 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented the work of the lab and the capabilities required to enable the next generation of smart robotics.
Year(s) Of Engagement Activity 2020
URL https://www.bmvc2020-conference.com/
 
Description Invited Speaker - CVPR 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact CVPR is one of the top Computer Vision and Pattern Recognition Conferences. Professor Davison presented the lab's research directions.
Year(s) Of Engagement Activity 2022
 
Description Invited Speaker - ECCV 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Professor Davison presented the work of the lab at an ECCV 2022 workshop. ECCV is one of the top computer vision conferences
Year(s) Of Engagement Activity 2022
 
Description Invited Speaker - RSS 2021 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Robotics: Science and Systems (RSS) is one of the top conferences in robotics, with over 600 participants from 40 different countries. Professor Andrew Davison gave an invited talk in one of the workshops.
Year(s) Of Engagement Activity 2021
 
Description Invited Speaker - Rank Prize Symposium on Neural Rendering 2022 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Professor Davison gave a presentation in the Rank Prize Symposium on Neural Rendering
Year(s) Of Engagement Activity 2022
URL https://www.rankprize.org/symposia/
 
Description Invited Speaker - University of Pennsylvania 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A talk "Towards Graph-Based Spatial AI" to inform the audience about the research work of the lab and the advances in computer vision for robotics
Year(s) Of Engagement Activity 2020
URL https://www.grasp.upenn.edu/events/andrew-davison-2020/
 
Description Invited talk - CVPR2020 workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Inform the audience on the advances in computer vision algorithms for mobile robots.
Year(s) Of Engagement Activity 2020
URL https://sites.google.com/view/vislocslamcvpr2020/home
 
Description Invited talk - CogX on the Research Stage "Research: The Long View" 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact To inform the audience about the research work in the Dyson Robotics Lab
Year(s) Of Engagement Activity 2020
 
Description Keynote Speaker - MRS conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact A keynote presentation of the lab current work.
Year(s) Of Engagement Activity 2021
URL https://mrs2021.org/
 
Description Presentation - Outreach Series Department of Computing 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Professor Andrew Davison gave a presentation of the current work of the lab in SLAM and its evolution towards Spatial AI. This presentation was part of the Department of Computing outreach program.
Year(s) Of Engagement Activity 2021
URL https://www.imperial.ac.uk/computing/outreach/outreach-news-and-events/