CORSMAL: Collaborative object recognition, shared manipulation and learning

Lead Research Organisation: Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science

Abstract

CORSMAL proposes to develop and validate a new framework for collaborative recognition and manipulation of objects via cooperation with humans. The project will explore the fusion of multiple sensing modalities (touch, sound and first- and third-person vision) to accurately and robustly estimate the physical properties of objects in noisy and potentially ambiguous environments. The framework will mimic the human capability of learning and adapting across a set of different manipulators, tasks, sensing configurations and environments. In particular, we will address the problems of (1) learning shared autonomy models via observations of and interactions with humans and (2) generalising capabilities across tasks and sites by aggregating data and abstracting models to enable accurate recognition and manipulation of unknown objects in unknown environments. The focus of CORSMAL is to define learning architectures for multimodal sensory data as well as for aggregated data from different environments. A key aim of the project is to identify the most suitable framework resulting from learning across environments, and the optimal trade-off between the use of specialised local models and generalised global models, with the goal of continually improving the adaptability and robustness of the models. The robustness of the proposed framework will be evaluated with prototype implementations in different environments. Importantly, during the project we will organise two community challenges to favour data sharing and support experiment reproducibility in additional sites.

Planned Impact

n/a
 
Description A method for the contactless estimation (through vision and sound signals) of the physical properties of objects manipulated by humans. These estimates inform the control of robots, enabling accurate and safe grasps of objects handed over by humans.

A real-to-simulation framework that integrates sensing data and a robotic arm simulator to complete the handover task, and that estimates the pose of the hand holding the container to help prevent unsafe grasps [19]. The framework facilitates the development of algorithms for object property estimation and robot planning, allows methods for safe handovers to be tested, and enables progress when access to a physical robot is unavailable. The simulator was developed to mitigate the limited access to laboratories caused by COVID-19 lockdowns and restrictions, and to facilitate the take-up of the CORSMAL dataset by a wider community. We demonstrated and validated the framework on the CORSMAL Containers Manipulation dataset using the CORSMAL vision-based baseline to estimate, online and without access to object models or motion-capture data, the shape and trajectory of a container.
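
The report does not name the simulator used; the following is a minimal sketch of the real-to-simulation idea using PyBullet as a stand-in, with a hypothetical container position in place of the output of the perception pipeline:

```python
# Minimal real-to-simulation sketch (illustrative only): drive a simulated
# arm towards a container position estimated by a perception pipeline.
# PyBullet stands in for the project's simulator; the estimated position is
# a hypothetical placeholder, not output from the CORSMAL baseline.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                      # headless physics server
p.setAdditionalSearchPath(pybullet_data.getDataPath())
arm = p.loadURDF("kuka_iiwa/model.urdf", useFixedBase=True)

# Hypothetical container position (metres, world frame) from vision/audio.
container_pos = [0.5, 0.1, 0.4]

# Inverse kinematics for the end effector (link 6 on the KUKA iiwa model),
# then apply the joint targets with position control.
end_effector = 6
joint_targets = p.calculateInverseKinematics(arm, end_effector, container_pos)
for joint, target in enumerate(joint_targets):
    p.setJointMotorControl2(arm, joint, p.POSITION_CONTROL,
                            targetPosition=target)

for _ in range(240):                     # simulate ~1 s at the default 240 Hz
    p.stepSimulation()
p.disconnect()
```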

A new method for testing the robustness of machine-learning classifiers through adversarial attacks. The proposed method can generate perturbations for images of any size, and outperforms five state-of-the-art attacks on two tasks (scene and object classification) and three state-of-the-art deep neural networks.
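
The report does not detail the attack itself; as background, here is a minimal sketch of a classic gradient-based attack (FGSM) in PyTorch, of the kind such robustness tests are typically compared against:

```python
# Minimal FGSM sketch (illustrative background, not the project's attack):
# perturb an image along the sign of the loss gradient to flip a classifier.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(pretrained=True).eval()

def fgsm_attack(image, label, epsilon=0.03):
    """Return an adversarially perturbed copy of `image` (N, 3, H, W)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss; clamp to valid range.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()

x = torch.rand(1, 3, 224, 224)           # stand-in input image
y = torch.tensor([0])                    # stand-in ground-truth label
x_adv = fgsm_attack(x, y)
print(model(x).argmax(1), model(x_adv).argmax(1))
```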

A new method for localising container-like objects in 3D and estimating their dimensions using two wide-baseline RGB cameras.
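
A minimal sketch of the underlying two-view geometry, assuming calibrated projection matrices and matched image points (all values hypothetical; the project's method adds the container-specific localisation and dimension estimation on top of this):

```python
# Two-view triangulation sketch (standard geometry, not the project's full
# method): recover a 3D point from matched pixels in two calibrated cameras.
import numpy as np
import cv2

# Hypothetical intrinsics and 3x4 projection matrices P = K [R | t].
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])            # reference camera
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])  # 0.5 m baseline

# Matched image points (2 x N arrays of pixel coordinates, here N = 1).
pts1 = np.array([[320.0], [240.0]])
pts2 = np.array([[220.0], [240.0]])

# Triangulate to homogeneous coordinates, then dehomogenise.
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
X = (X_h[:3] / X_h[3]).ravel()
print("3D point (world frame):", X)    # approx. (0, 0, 3) for these inputs
```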

A new method for filling-level classification, trained using transfer learning and adversarial training.
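
A minimal sketch of the transfer-learning half of such a recipe in PyTorch, assuming a ResNet-18 backbone (as in the pretrained models released below) and three filling-level classes; the adversarial-training half would additionally mix perturbed copies of each batch into training:

```python
# Transfer-learning sketch (illustrative, not the project's exact recipe):
# adapt an ImageNet-pretrained ResNet-18 to filling-level classification.
import torch
import torch.nn as nn
from torchvision.models import resnet18

NUM_CLASSES = 3                          # assumption: empty / half-full / full

model = resnet18(pretrained=True)
for param in model.parameters():         # freeze the pretrained backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new trainable head

optimiser = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on stand-in data; a real loop would iterate
# over the C-CCM splits and could add adversarially perturbed copies of the
# batch (adversarial training) before the forward pass.
images = torch.rand(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))
optimiser.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimiser.step()
```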
Exploitation Route Through a benchmark that we designed and the open-source code we distribute: https://corsmal.eecs.qmul.ac.uk/benchmark.html

The data we have produced have already been used by research laboratories across the world (see the list of participants at https://corsmal.eecs.qmul.ac.uk/challenge.html)
Sectors Digital/Communication/Information Technologies (including Software)

URL https://corsmal.eecs.qmul.ac.uk/publications.html
 
Description The organisation of the CORSMAL Challenge at IEEE ICASSP 2022, IEEE ICME 2020, the Intelligent Sensing Summer School 2020, and ICPR 2020, which had 30 participants. The leaderboard for the CORSMAL Challenge has accumulated 12 entries, of which 6 are results from teams and 6 are baselines. IET QMUL Children's Christmas Lecture 2019, London, UK (11 December 2019): presentation of the tasks and objectives of CORSMAL to an audience of children, teachers and parents. The publication, by teams outside the CORSMAL project, of six papers and one technical report on the methods developed for the CORSMAL Challenge.
First Year Of Impact 2019
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Cultural

 
Title CORSMAL Containers 
Description The dataset was acquired with two Intel RealSense D435i cameras, located approximately 40 cm from the object placed on top of a table. The cameras are calibrated and localised with respect to a calibration board. The resulting images (1280x720 pixels) are RGB, depth and stereo infrared, with the RGB and depth images spatially aligned. Data acquisition was performed in two separate rooms with different lighting conditions; different backgrounds were obtained using two tablecloths in addition to a scenario with no tablecloth. The first room has natural light and a table-top of 160x80 cm at a height of 82 cm from the ground. The second room has no windows, with illumination provided by either ceiling lights or additional portable lights, and a table of 60x60 cm at a height of 82 cm from the ground. We collected in total 207 configurations, as the combination of objects (23), backgrounds (3) and lighting conditions (3), resulting in 414 RGB images, 414 depth images and 828 infrared images. We manually annotated the maximum width and height of each object with a digital caliper (0-150 mm ± 0.01 mm) and a measuring tape (0-10 m ± 1 mm). 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact The dataset was used in 2 publications. 
URL http://corsmal.eecs.qmul.ac.uk/containers.html
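
Given the calibrated, spatially aligned RGB and depth images above, object dimensions can be recovered by deprojecting pixels to 3D with the pinhole model. A minimal sketch, with hypothetical intrinsics and annotation pixels (the dataset ships its own calibration and annotations):

```python
# Pinhole deprojection sketch (illustrative): estimate an object's width in
# metres from two annotated pixels and the aligned depth image. Intrinsics
# and pixel coordinates below are hypothetical stand-ins for the dataset's
# own calibration and annotations.
import numpy as np

fx, fy, cx, cy = 920.0, 920.0, 640.0, 360.0   # assumed 1280x720 intrinsics

def deproject(u, v, depth_m):
    """Back-project pixel (u, v) with depth in metres to a 3D camera point."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

depth = np.full((720, 1280), 0.40)            # stand-in aligned depth (40 cm)
left, right = (500, 360), (700, 360)          # hypothetical object extremes

p_left = deproject(*left, depth[left[1], left[0]])
p_right = deproject(*right, depth[right[1], right[0]])
print("estimated width (m):", np.linalg.norm(p_right - p_left))
```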
 
Title CORSMAL Containers Manipulation 
Description The dataset consists of multiple recordings of containers: drinking cups, drinking glasses and food boxes. These containers are made of different materials, such as plastic, glass and paper. Each container can be empty or filled with water, rice or pasta at two different levels of fullness. The combinations of containers and fillings are acquired under three scenarios with an increasing level of difficulty, caused by occlusions or subject motions. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact Used in the 2020 CORSMAL Challenge. Has already appeared in 3 external publications. 
URL http://corsmal.eecs.qmul.ac.uk/containers_manip.html
 
Title Trained models for filling level classification 
Description The networks are pre-trained on the 3 splits (S1, S2, S3) of the C-CCM dataset, using six different training strategies. The networks are implemented in PyTorch. More information regarding the C-CCM dataset can be found here: https://corsmal.eecs.qmul.ac.uk/filling.html The CCM_Filling_Level_Pretrained_Models.zip file contains:
- 3 folders (S1, S2, S3) that correspond to the different dataset splits.
- Each of the S1, S2, S3 folders contains 6 subfolders (ST, AT, ST-FT, ST-AFT, AT-FT, AT-AFT), which correspond to the different training strategies used in the paper.
- Each of the ST, AT, ..., AT-AFT subfolders contains a PyTorch file named last.t7. This is the PyTorch ResNet-18 model trained on the corresponding split (S1/S2/S3) using the corresponding training strategy (ST, AT, ..., AT-AFT).
A Python example script for loading the models is also provided (load_model.py). 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact Recently published 
URL https://zenodo.org/record/4518951#.YC9-z-qnw5k
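
A minimal loading sketch in the spirit of the provided load_model.py (not reproduced here); it assumes last.t7 stores a state dict for a 3-class ResNet-18, which may differ from the actual serialisation:

```python
# Model-loading sketch (assumptions: last.t7 holds a ResNet-18 state dict
# with a 3-class head; the released load_model.py is the authoritative way).
import torch
from torchvision.models import resnet18

checkpoint = torch.load("S1/ST/last.t7", map_location="cpu")
# Some checkpoints wrap the weights, e.g. under a 'state_dict' key.
if isinstance(checkpoint, dict):
    state_dict = checkpoint.get("state_dict", checkpoint)
else:
    state_dict = checkpoint

model = resnet18(num_classes=3)          # assumed number of filling levels
model.load_state_dict(state_dict)
model.eval()

with torch.no_grad():
    logits = model(torch.rand(1, 3, 224, 224))   # stand-in input crop
print("predicted filling level:", logits.argmax(1).item())
```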
 
Description EPFL-corsmal 
Organisation Swiss Federal Institute of Technology in Lausanne (EPFL)
Country Switzerland 
Sector Public 
PI Contribution Research collaboration resulting in 3 joint publications.
Collaborator Contribution Expertise in robotic control, robotic manipulation, and robustness of machine learning models.
Impact A multi-disciplinary collaboration that resulted in 3 joint publications. Disciplines involved: robotics, control, machine learning, computer vision, digital signal processing.
Start Year 2019
 
Description information fusion 
Organisation Swiss Federal Institute of Technology in Lausanne (EPFL)
Country Switzerland 
Sector Public 
PI Contribution A new method for the real-time estimation, through vision, of the physical properties of objects manipulated by humans, which informs the control of robots performing accurate and safe grasps of objects handed over by humans.
Collaborator Contribution The design of the control of a robot for performing accurate and safe grasps of objects handed over by humans.
Impact Multi-disciplinary collaboration; outcome: https://ieeexplore.ieee.org/document/8968407
Start Year 2019