CORSMAL: Collaborative object recognition, shared manipulation and learning

Lead Research Organisation: Queen Mary, University of London
Department Name: Sch of Electronic Eng & Computer Science

Abstract

CORSMAL proposes to develop and validate a new framework for collaborative recognition and manipulation of objects via cooperation with humans. The project will explore the fusion of multiple sensing modalities (touch, sound and first/third person vision) to accurately and robustly estimate the physical properties of objects in noisy and potentially ambiguous environments. The framework will mimic human capability of learning and adapting across a set of different manipulators, tasks, sensing configurations and environments. In particular, we will address the problems of (1) learning shared autonomy models via observations of and interactions with humans and (2) generalising capabilities across tasks and sites by aggregating data and abstracting models to enable accurate object recognition and manipulation of unknown objects in unknown environments. The focus of CORSMAL is to define learning architectures for multimodal sensory data as well as for aggregated data from different environments. A key aim of the project is to identify the most suitable framework resulting from learning across environments and the optimal trade-off between the use of specialised local models and generalised global models. The goal here is to continually improve the adaptability and robustness of the models. The robustness of the proposed framework will be evaluated with prototype implementations in different environments. Importantly, during the project we will organise two community challenges to favour data sharing and support experiment reproducibility in additional sites.

Planned Impact

n/a

Publications

Modas A (2020) Toward Robust Sensing for Autonomous Vehicles: An Adversarial Perspective in IEEE Signal Processing Magazine

Oh C (2021) View-Action Representation Learning for Active First-Person Vision in IEEE Transactions on Circuits and Systems for Video Technology

Sanchez-Matilla R (2020) Benchmark for Human-to-Robot Handovers of Unseen Containers With Unknown Filling in IEEE Robotics and Automation Letters

 
Description A method for the real-time, vision-based estimation of the physical properties of objects manipulated by humans, to inform the control of robots performing accurate and safe grasps of objects handed over by humans.

A real-to-simulation framework that integrates sensing data with a robotic arm simulator to complete the handover task, and that estimates the pose of the hand holding the container to help prevent unsafe grasps [19]. The framework supports the development of algorithms for estimating object properties and for robot planning, allows methods for safe handovers to be tested, and enables progress when access to a robot is unavailable. The simulator was developed to mitigate the limited access to laboratories caused by COVID-19 lockdowns and restrictions, and to facilitate the take-up of the CORSMAL dataset by a wider community. We demonstrated and validated the framework on the CORSMAL Containers Manipulation dataset, using the CORSMAL vision-based baseline to estimate the shape and trajectory of a container online, without access to object models or motion-capture data.

A new method for testing the robustness of machine-learning classifiers through adversarial attacks. The proposed method can generate perturbations for images of any size, and outperforms five state-of-the-art attacks on two different tasks (scene classification and object classification) and three state-of-the-art deep neural networks.
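The general principle behind gradient-based adversarial attacks can be illustrated with the fast gradient sign method (FGSM) on a toy linear classifier. This is a minimal sketch of the attack family, not the method developed in the project; the linear model and function names are assumptions for the example:

```python
def predict(w, x):
    # toy linear classifier: sign of the dot product w.x
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1

def fgsm(w, x, y, eps):
    # FGSM step: the gradient of the margin loss -y*(w.x) w.r.t. x is -y*w,
    # so move each input coordinate by eps in the sign of that gradient
    sign = lambda v: (v > 0) - (v < 0)
    return [xi + eps * sign(-y * wi) for xi, wi in zip(x, w)]

w = [1.0, -2.0]
x = [2.0, 1.0]               # clean input, classified as +1
x_adv = fgsm(w, x, 1, 3.0)   # perturbed input, classified as -1
```

With a large enough perturbation budget eps the prediction flips; attacks on deep networks follow the same recipe, with the gradient obtained by backpropagation instead of in closed form.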

A new method for localising container-like objects in 3D and estimating their dimensions using two wide-baseline RGB cameras.
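Localisation from two calibrated cameras ultimately rests on triangulating the two viewing rays that point at the object. As an illustration of the underlying geometry only (the actual CORSMAL method is not reproduced here), a minimal midpoint triangulation is:

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def triangulate_midpoint(p1, d1, p2, d2):
    """Midpoint of the shortest segment between rays p1 + t*d1 and p2 + s*d2.

    p1, p2 are the camera centres; d1, d2 are the viewing-ray directions.
    """
    w = [p1[i] - p2[i] for i in range(3)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b            # zero only for parallel rays
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    q1 = [p1[i] + t * d1[i] for i in range(3)]   # closest point on ray 1
    q2 = [p2[i] + s * d2[i] for i in range(3)]   # closest point on ray 2
    return [(q1[i] + q2[i]) / 2 for i in range(3)]
```

With a wide baseline the two rays are well separated, so their intersection, and hence the estimated 3D position and dimensions, is better conditioned than in a narrow-baseline setup.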
Exploitation Route Through a benchmark that we designed and the open-source code that we distribute: https://corsmal.eecs.qmul.ac.uk/benchmark.html
Sectors Digital/Communication/Information Technologies (including Software)

 
Description 11 December 2019: IET QMUL Children's Christmas Lecture 2019, London, UK - presentation of the tasks and objectives of CORSMAL to an audience of children, teachers and parents. Teams outside the CORSMAL project published three papers and one technical report on the methods developed for the CORSMAL Challenge. The CORSMAL Challenge was organised at ICME 2020, at the Intelligent Sensing Summer School 2020, and at ICPR 2020, attracting 30 participants. The leaderboard for the CORSMAL Challenge has accumulated 12 entries, of which 6 are results from teams and 6 are baselines.
First Year Of Impact 2000
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Cultural

 
Title CORSMAL Containers 
Description The dataset was acquired with two Intel RealSense D435i cameras, located approximately 40 cm from the object placed on top of a table. The cameras are calibrated and localised with respect to a calibration board. The resulting images (1280x720 pixels) are RGB, depth and stereo infrared, with the RGB and depth images spatially aligned. Data acquisition was performed in two separate rooms with different lighting conditions, and three different backgrounds were obtained using two tablecloths in addition to the plain table-top. The first room has natural light and a table-top of 160x80 cm at a height of 82 cm from the ground. The second room has no windows, with illumination provided by either ceiling lights or additional portable lights, and a table of 60x60 cm at a height of 82 cm from the ground. We collected in total 207 configurations, as the combination of objects (23), backgrounds (3) and lighting conditions (3), resulting in 414 RGB images, 414 depth images and 828 infrared images. We manually annotated the maximum width and height of each object with a digital caliper (0-150 mm ± 0.01 mm) and a measuring tape (0-10 m ± 1 mm). 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact The dataset was used in 2 publications. 
URL http://corsmal.eecs.qmul.ac.uk/containers.html
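The image counts in the dataset description above follow directly from the stated combinations; a quick arithmetic check, assuming one RGB, one depth and two stereo-infrared images per camera per configuration:

```python
objects, backgrounds, lighting = 23, 3, 3
cameras = 2  # two Intel RealSense D435i

configurations = objects * backgrounds * lighting  # 23 * 3 * 3 = 207
rgb_images = configurations * cameras              # 414
depth_images = configurations * cameras            # 414
ir_images = configurations * cameras * 2           # 828 (stereo pair per camera)
```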
 
Title CORSMAL Containers Manipulation 
Description The dataset consists of multiple recordings of containers: drinking cups, drinking glasses and food boxes. These containers are made of different materials, such as plastic, glass and paper. Each container can be empty or filled with water, rice or pasta at two different levels of fullness. The combinations of containers and fillings are acquired under three scenarios with an increasing level of difficulty, caused by occlusions or subject motions. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
Impact Used in 2020 CORSMAL Challenge. Appeared already in 3 external publications. 
URL http://corsmal.eecs.qmul.ac.uk/containers_manip.html
 
Title Trained models for filling level classification 
Description The networks are pre-trained on the 3 splits (S1, S2, S3) of the C-CCM dataset, using six different training strategies, and are implemented in PyTorch. More information on the C-CCM dataset can be found here: https://corsmal.eecs.qmul.ac.uk/filling.html
The CCM_Filling_Level_Pretrained_Models.zip file contains:
- 3 folders (S1, S2, S3) that correspond to the different dataset splits.
- Each of the S1, S2, S3 folders contains 6 subfolders (ST, AT, ST-FT, ST-AFT, AT-FT, AT-AFT), which correspond to the different training strategies used in the paper.
- Each of the ST, AT, ..., AT-AFT subfolders contains a PyTorch file named last.t7: the ResNet-18 model trained on the corresponding split (S1/S2/S3) with the corresponding training strategy.
- A Python example script for loading the models (load_model.py). 
Type Of Material Computer model/algorithm 
Year Produced 2021 
Provided To Others? Yes  
Impact Recently published 
URL https://zenodo.org/record/4518951#.YC9-z-qnw5k
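The zip layout described above maps each (split, strategy) pair to one checkpoint. A small helper to enumerate the 18 expected checkpoint paths (the folder names follow the description; the root folder name is an assumption, and loading a checkpoint itself requires PyTorch and is not shown):

```python
import os

SPLITS = ["S1", "S2", "S3"]
STRATEGIES = ["ST", "AT", "ST-FT", "ST-AFT", "AT-FT", "AT-AFT"]

def checkpoint_paths(root="CCM_Filling_Level_Pretrained_Models"):
    # one ResNet-18 checkpoint (last.t7) per dataset split and training strategy
    return [os.path.join(root, split, strategy, "last.t7")
            for split in SPLITS
            for strategy in STRATEGIES]
```

Each returned path can then be passed to torch.load (or to the provided load_model.py) to restore the corresponding model.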
 
Description EPFL-corsmal 
Organisation Swiss Federal Institute of Technology in Lausanne (EPFL)
Country Switzerland 
Sector Public 
PI Contribution research collaboration resulting in 3 joint publications.
Collaborator Contribution expertise in robotic control, robotic manipulation, and robustness of machine learning models
Impact A multi-disciplinary collaboration that resulted in 3 joint publications. Disciplines involved: robotics, control, machine learning, computer vision, digital signal processing.
Start Year 2019
 
Description information fusion 
Organisation Swiss Federal Institute of Technology in Lausanne (EPFL)
Country Switzerland 
Sector Public 
PI Contribution A new method for the real-time, vision-based estimation of the physical properties of objects manipulated by humans, to inform the control of robots performing accurate and safe grasps of objects handed over by humans.
Collaborator Contribution The design of the control of a robot for performing accurate and safe grasps of objects handed over by humans.
Impact multi-disciplinary collaboration - outcome: https://ieeexplore.ieee.org/document/8968407
Start Year 2019