BURG: Benchmarks for UndeRstanding Grasping
Lead Research Organisation:
University of Birmingham
Department Name: School of Computer Science
Abstract
Grasping rigid objects has been studied extensively under a wide variety of settings. The common measure of success is a check of whether the robot can hold an object for a few seconds. This is not enough. To obtain a deeper understanding of object manipulation, we propose (1) a task-oriented, part-based modelling of grasping and (2) BURG - our castle of setups, tools and metrics for community building around an objective benchmark protocol.
The idea is to boost grasping research by focusing on complete tasks. This calls for attention to object parts, since they are essential for knowing how and where the gripper can grasp given the manipulation constraints imposed by the task. Moreover, parts facilitate knowledge transfer to novel objects, across different sources (virtual/real data) and grippers, providing for a versatile and scalable system. The part-based approach naturally extends to deformable objects, for which recognising the relevant semantic parts, regardless of the object's actual deformation, is essential to make the manipulation problem tractable. Finally, by focusing on parts we can deal more easily with environmental constraints that are detected and used to facilitate grasping.
Regarding the benchmarking of manipulation, robotics has so far suffered from grasping and manipulation work that cannot be compared across studies. Datasets cover only the object detection aspect. Object sets are difficult to obtain and not extensible, and neither scenes nor manipulation tasks are replicable. There are no common tools to address the basic needs of setting up replicable scenes or reliably estimating object pose.
Hence, with the BURG benchmark we propose to focus on community building by enabling and sharing tools for reproducible performance evaluation, including collecting data and feedback from different laboratories to study manipulation across different robot embodiments. We will develop a set of repeatable scenarios spanning different levels of quantifiable complexity, determined by the choice of objects, tasks and environments. Examples include fully quantified settings with layers of objects, adding deformable objects and environmental constraints. The benchmark will include metrics defined to assess the performance of both low-level primitives (object pose, grasp point and type, collision-free motion) and manipulation tasks (stacking, aligning, assembling, packing, handover, folding) that require ordering as well as common-sense knowledge for semantic reasoning.
Planned Impact
N/A
Organisations
- University of Birmingham (Lead Research Organisation)
- French National Research Agency ANR (Co-funder)
- Italian Institute of Technology (Istituto Italiano di Tecnologia IIT) (Collaboration)
- Vienna University of Technology (Collaboration)
- Spanish National Research Council (CSIC) (Collaboration)
- Polytechnic University of Turin (Collaboration)
Publications

Alliegro A (2022) End-to-End Learning to Grasp via Sampling from Object Point Clouds

Alliegro A (2022) End-to-End Learning to Grasp via Sampling From Object Point Clouds, in IEEE Robotics and Automation Letters

Ani M (2021) Quantifying the Use of Domain Randomization

Collins J (2024) RAMP: A Benchmark for Evaluating Robotic Assembly Manipulation and Planning, in IEEE Robotics and Automation Letters
Description | Grasping and manipulation of objects are fundamental skills for robots to interact with their environment and perform various tasks. The University of Birmingham has developed several methods, toolkits, and benchmarks related to grasping and manipulation of objects. These tools enable the creation of physically plausible virtual scenes for generating training data and grasping in simulation, the recreation of the scenes by arranging the objects accurately in the physical world for real robot experiments, and the sharing of the scenes with the community to foster comparability and reproducibility of experimental research. This research has resulted in international collaborations, in particular with the Vienna University of Technology and Politecnico di Torino.

Many robot (and human) manipulation tasks involve making and breaking of contacts with objects and surfaces. These "change of contact" points are associated with spikes in forces that can damage the robot or the objects/surfaces. The University of Birmingham has developed a framework inspired by insight into human motor control that enables a robot manipulator to use very few examples to rapidly learn simple forward models that predict the forces likely to be experienced by the robot. Depending on the error between the predicted and measured forces during task performance, the robot is able to quickly adapt its movement to minimise the spikes, leading to smoother motion. We were able to demonstrate the framework's capabilities in simulation and on physical robot manipulators.

The University of Birmingham also contributed to the development of RAMP, an open-source robot manipulation benchmark inspired by real-world industrial assembly tasks. RAMP poses tasks that require the robot to assemble beams into specified goal configurations using pegs as fasteners. It supports the assessment of capabilities related to open problems in perception, reasoning, manipulation, diagnostics, and fault recovery. RAMP has been designed to be accessible and extensible. Parts are either 3D printed or otherwise constructed from materials that are readily obtainable, based on detailed instructions. RAMP also allows researchers to focus on individual sub-tasks of the assembly challenge if desired. Overall, we provide a full digital twin as well as simple baselines to enable rapid progress, and to support a community-driven endeavour that evolves as capability matures. |
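The force-prediction framework is described above only at a high level. The snippet below is a minimal, illustrative sketch of the underlying idea, not the actual implementation: fit a simple linear forward model of expected contact force from a handful of examples, then scale down the commanded velocity whenever the measured force deviates strongly from the prediction. All names, state features and thresholds are hypothetical.

import numpy as np

# Illustrative sketch only: a linear forward model predicting contact force
# from a simple robot state (here just the approach velocity). The feature
# choice, gain and tolerance are hypothetical, not those of the real framework.

def fit_forward_model(states, forces):
    """Least-squares fit of force = [state, 1] @ W from a few examples."""
    X = np.hstack([states, np.ones((len(states), 1))])  # add bias term
    W, *_ = np.linalg.lstsq(X, forces, rcond=None)
    return W

def predict_force(W, state):
    return np.append(state, 1.0) @ W

def adapt_velocity(v_cmd, f_measured, f_predicted, gain=0.1, f_tol=2.0):
    """Scale down the commanded velocity when the force prediction error spikes."""
    error = abs(f_measured - f_predicted)
    if error > f_tol:
        v_cmd = v_cmd / (1.0 + gain * (error - f_tol))
    return v_cmd

# Toy usage with synthetic data: contact force grows with approach speed.
rng = np.random.default_rng(0)
states = rng.uniform(0.0, 0.1, size=(5, 1))            # a few example states
forces = 50.0 * states[:, 0] + rng.normal(0, 0.1, 5)   # corresponding measured forces
W = fit_forward_model(states, forces)

v = 0.05                                     # commanded approach velocity (m/s)
f_meas, f_pred = 6.0, predict_force(W, np.array([v]))
print(adapt_velocity(v, f_meas, f_pred))     # velocity is reduced if the error is large

In the real framework, richer state features and a task-specific adaptation law would replace the toy choices made here.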
Exploitation Route | The methods related to grasping and manipulation of objects, together with the corresponding toolkits and benchmarks developed by the University of Birmingham, have the potential to be used in a broad range of robotic applications, such as industrial manipulation, service robotics, and assistive robotics. We expect that these tools will be adopted by the wider research community, as they facilitate the development of new and domain-specific methods through simulation, the generation of training data, and the planning of real robot experiments, while fostering comparability and reproducibility of experimental research. |
Sectors | Digital/Communication/Information Technologies (including Software) Other |
Title | Grasping dataset YCB-76 |
Description | This is a dataset for evaluation of grasping from object point clouds. It contains 76 objects from the renowned YCB object set, which are arranged in 259 distinct grasping scenarios. For each scenario, we provide point clouds from synthetically created depth images. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | This dataset has been used throughout our collaboration "end-to-end learning to grasp from object point clouds". It is suitable for evaluation purposes and specifically for analysing the capabilities of learning-based grasping methods to generalise to novel, unseen shapes. The performance is evaluated using simulation-based success rates. |
URL | https://github.com/antoalli/L2G |
Title | Grasping dataset YCB-8 |
Description | This dataset is a synthetic dataset for robotic grasping from point clouds. It is made using 8 objects from the renowned YCB object set and contains 15 grasping scenarios with each object being in a certain resting pose. For each scenario, we provide point clouds from 300 different, synthetically rendered depth images as well as 100k grasp annotations. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | This dataset has been used throughout our collaboration "end-to-end learning to grasp from object point clouds". It is suitable for evaluation purposes and specifically for analysing the capabilities of learning-based grasping methods to generalise to novel, unseen shapes. We can produce simulation-based success measures and, using the grasp annotations, also measures like coverage, i.e. which proportion of the annotated grasps could be reproduced by the method. |
URL | https://github.com/antoalli/L2G |
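As a rough illustration of the coverage measure mentioned in the Impact entry above, the sketch below counts the fraction of annotated grasps matched by at least one predicted grasp within translation and rotation thresholds, accounting for the 180-degree flip symmetry of a parallel-jaw gripper. The 4x4 pose representation, axis convention and threshold values are assumptions for illustration, not necessarily those used in the actual evaluation.

import numpy as np

def rotation_angle(R_a, R_b):
    """Angle (rad) of the relative rotation between two rotation matrices."""
    cos = (np.trace(R_a.T @ R_b) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))

def grasps_match(T_pred, T_gt, trans_tol=0.02, rot_tol=np.deg2rad(15)):
    """Compare two 4x4 grasp poses; a parallel-jaw gripper is symmetric under
    a 180 deg flip about its approach (here z) axis, so test both variants."""
    flip = np.diag([-1.0, -1.0, 1.0])                  # 180 deg rotation about z
    d_trans = np.linalg.norm(T_pred[:3, 3] - T_gt[:3, 3])
    d_rot = min(rotation_angle(T_pred[:3, :3], T_gt[:3, :3]),
                rotation_angle(T_pred[:3, :3] @ flip, T_gt[:3, :3]))
    return d_trans <= trans_tol and d_rot <= rot_tol

def coverage(predicted, annotated):
    """Fraction of annotated grasps reproduced by at least one prediction."""
    hits = sum(any(grasps_match(T_p, T_gt) for T_p in predicted)
               for T_gt in annotated)
    return hits / len(annotated) if annotated else 0.0

# Toy usage (a real evaluation would load predicted and annotated dataset grasps).
print(coverage([np.eye(4)], [np.eye(4)]))   # -> 1.0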
Description | End-to-end learning to grasp from point clouds |
Organisation | Italian Institute of Technology (Istituto Italiano di Tecnologia IIT) |
Country | Italy |
Sector | Academic/University |
PI Contribution | In this collaboration we are investigating the problem of grasping objects with a 2-finger parallel-jaw gripper. More specifically, the core of the problem is to identify suitable 6-DOF grasp poses based on a partial point cloud of the object obtained from a single-view depth image, for which we develop novel Deep-Learning-based methods. We contributed to the conceptual development of the method (e.g. used representations, network architecture) with our expertise in robotic grasping and 6-DOF object pose estimation, which is a related problem that often utilises similar Deep Learning methods based on point cloud representations. We created datasets that can be used for training as well as for simulation-based evaluation of the proposed grasps. In connection with the datasets, we also contributed tools to evaluate individual grasps in a physics-based simulation environment as well as tools and metrics for thorough evaluation of sets of grasp predictions. Furthermore, we implemented an experimental setup for real robot grasping trials and conducted the required experiments. In this collaboration, we also made use of equipment (e.g. robot, sensors, test objects, GPUs) in our robotics lab which has been acquired from earlier funding (not within this grant). |
Collaborator Contribution | Our partners have expertise in Deep Learning on point clouds, specifically encoding local as well as global object features, which has proven to be valuable for tasks such as shape completion (i.e. estimation of the unobserved part of the object). We consider this very valuable also in the context of robotic grasping. Besides the conceptual development of the method, our partners also contributed by implementing the model, designing experiments and ablation studies, and conducting the majority of simulation-based experiments using their high-performance computing equipment. |
Impact | The outcomes are partially published. Already published and described in the corresponding sections are: - Datasets: YCB-8, YCB-76 - Software/Tools: Grasp Evaluator Published by our collaborators: - Software: L2G (Learning to Grasp) method, central entry point to this work, also references all relevant tools and datasets; see https://github.com/antoalli/L2G Submitted but not published: - Paper "End-to-End Learning to Grasp from Object Point Clouds" submitted to RA-L/IROS This collaboration is not multi-disciplinary. |
Start Year | 2020 |
Description | End-to-end learning to grasp from point clouds |
Organisation | Polytechnic University of Turin |
Country | Italy |
Sector | Academic/University |
PI Contribution | In this collaboration we are investigating the problem of grasping objects with a 2-finger parallel-jaw gripper. More specifically, the core of the problem is to identify suitable 6-DOF grasp poses based on a partial point cloud of the object obtained from a single-view depth image, for which we develop novel Deep-Learning-based methods. We contributed to the conceptual development of the method (e.g. used representations, network architecture) with our expertise in robotic grasping and 6-DOF object pose estimation, which is a related problem that often utilises similar Deep Learning methods based on point cloud representations. We created datasets that can be used for training as well as for simulation-based evaluation of the proposed grasps. In connection with the datasets, we also contributed tools to evaluate individual grasps in a physics-based simulation environment as well as tools and metrics for thorough evaluation of sets of grasp predictions. Furthermore, we implemented an experimental setup for real robot grasping trials and conducted the required experiments. In this collaboration, we also made use of equipment (e.g. robot, sensors, test objects, GPUs) in our robotics lab which has been acquired from earlier funding (not within this grant). |
Collaborator Contribution | Our partners have expertise in Deep Learning on point clouds, specifically encoding local as well as global object features, which has proven to be valuable for tasks such as shape completion (i.e. estimation of the unobserved part of the object). We consider this very valuable also in the context of robotic grasping. Besides the conceptual development of the method, our partners also contributed by implementing the model, designing experiments and ablation studies, and conducting the majority of simulation-based experiments using their high-performance computing equipment. |
Impact | The outcomes are partially published. Already published and described in the corresponding sections are: - Datasets: YCB-8, YCB-76 - Software/Tools: Grasp Evaluator Published by our collaborators: - Software: L2G (Learning to Grasp) method, central entry point to this work, also references all relevant tools and datasets; see https://github.com/antoalli/L2G Submitted but not published: - Paper "End-to-End Learning to Grasp from Object Point Clouds" submitted to RA-L/IROS This collaboration is not multi-disciplinary. |
Start Year | 2020 |
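To make the notion of a 6-DOF grasp pose for a 2-finger parallel-jaw gripper concrete, the sketch below constructs a 4x4 gripper pose from a pair of contact points and an approach direction: the grasp centre is the contact midpoint, the closing axis connects the two contacts, and the remaining axis completes a right-handed frame. This is a generic geometric construction for illustration, not the method developed in this collaboration; the axis convention is an assumption.

import numpy as np

def grasp_pose_from_contacts(c1, c2, approach=np.array([0.0, 0.0, -1.0])):
    """Build a 4x4 parallel-jaw grasp pose from two contact points.
    x-axis: closing direction (between the contacts), z-axis: approach,
    origin: midpoint between the contacts. Axis convention is illustrative."""
    x = c2 - c1
    width = np.linalg.norm(x)                 # required gripper opening
    x = x / width
    z = approach - np.dot(approach, x) * x    # make approach orthogonal to closing axis
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)                        # complete a right-handed frame
    T = np.eye(4)
    T[:3, 0], T[:3, 1], T[:3, 2] = x, y, z
    T[:3, 3] = (c1 + c2) / 2.0
    return T, width

# Toy usage: two contacts on opposite sides of a 4 cm wide object.
T, width = grasp_pose_from_contacts(np.array([0.0, -0.02, 0.1]),
                                    np.array([0.0,  0.02, 0.1]))
print(np.round(T, 3), width)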
Description | SetupTool for benchmarking robotic grasping |
Organisation | Spanish National Research Council (CSIC) |
Country | Spain |
Sector | Public |
PI Contribution | In this collaboration, we created a tool for arranging scenes for both simulated and real experiments. An intuitive GUI allows the arrangement of 3D object models, while a physics simulation engine ensures the physical plausibility of the scene. This can be used both in grasp simulations and to create printouts indicating the poses of the objects, so that the real objects can be arranged accordingly. We built upon our expertise in robotic grasping of rigid objects and worked on the design and implementation of the back-end of the software, which provides the core functionalities in the form of a Python package. |
Collaborator Contribution | Our partners at TU Vienna brought in their experience in building visual tools, e.g. for labelling object poses. They designed and implemented the front-end of the software, which handles all user interaction and allows intuitive arrangement of the objects. Our partners at CSIC Barcelona have expertise in grasping deformables, in particular cloth-like objects. They provided intellectual input on how to integrate such deformable objects into the SetupTool and provided valuable feedback throughout the development and testing stages. |
Impact | Outcomes that are published by us and described in the corresponding sections are: - Software: BURG Toolkit as back-end of the SetupTool Outcomes that are published by our partners: - Software: SetupTool GUI as front-end of the SetupTool This collaboration is not multi-disciplinary. |
Start Year | 2020 |
Description | SetupTool for benchmarking robotic grasping |
Organisation | Vienna University of Technology |
Country | Austria |
Sector | Academic/University |
PI Contribution | In this collaboration, we created a tool for arranging scenes for both simulated and real experiments. An intuitive GUI allows the arrangement of 3D object models, while a physics simulation engine ensures the physical plausibility of the scene. This can be used both in grasp simulations and to create printouts indicating the poses of the objects, so that the real objects can be arranged accordingly. We built upon our expertise in robotic grasping of rigid objects and worked on the design and implementation of the back-end of the software, which provides the core functionalities in the form of a Python package. |
Collaborator Contribution | Our partners at TU Vienna brought in their experience in building visual tools, e.g. for labelling object poses. They designed and implemented the front-end of the software, which handles all user interaction and allows intuitive arrangement of the objects. Our partners at CSIC Barcelona have expertise in grasping deformables, in particular cloth-like objects. They provided intellectual input on how to integrate such deformable objects into the SetupTool and provided valuable feedback throughout the development and testing stages. |
Impact | Outcomes that are published by us and described in the corresponding sections are: - Software: BURG Toolkit as back-end of the SetupTool Outcomes that are published by our partners: - Software: SetupTool GUI as front-end of the SetupTool This collaboration is not multi-disciplinary. |
Start Year | 2020 |
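The physical-plausibility step described above can be illustrated with a minimal PyBullet snippet: place an object at a user-chosen pose, let the physics engine settle it under gravity, and read back the resulting resting pose for the scene description. This is an illustrative sketch using stock PyBullet example assets, not the SetupTool or BURG Toolkit code itself.

import pybullet as p
import pybullet_data

# Illustrative sketch: settle objects under gravity to obtain a physically
# plausible scene (resting poses), as the SetupTool does via its simulation back-end.
p.connect(p.DIRECT)                           # headless physics simulation
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                      # support surface

# Place an object slightly above the surface at a user-specified pose.
obj = p.loadURDF("duck_vhacd.urdf", basePosition=[0.0, 0.0, 0.05])

for _ in range(500):                          # step until the scene is at rest
    p.stepSimulation()

pos, orn = p.getBasePositionAndOrientation(obj)
print("settled pose:", pos, orn)              # store this pose in the scene description
p.disconnect()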
Title | BURG Toolkit |
Description | The BURG Toolkit is a Python package for Benchmarking and Understanding Robotic Grasping. The main features supported by the toolkit are: (i) core data structures for object libraries, scenes, grippers, grasps, and other fundamental constructs related to grasping; (ii) physical grasp simulation using the PyBullet physics engine; (iii) the ability to provide printouts that can be used to arrange objects in the physical world in configurations that match those in the simulated environment (for experimental evaluation); (iv) the ability to create datasets, in particular sampling grasps based on a 3D object model and rendering (depth) images of scenes; and (v) visualisation of scenes and grasps for experimental evaluation. The toolkit can be used as a stand-alone Python package or with the BURG Toolkit GUI as a user interface, which has been developed by project collaborators from TU Vienna. |
Type Of Technology | Software |
Year Produced | 2022 |
Open Source License? | Yes |
Impact | The software has just been released so there are no impacts to report yet. We plan to promote the software at academic conferences and in our research networks. |
URL | https://github.com/mrudorfer/burg-toolkit |
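Feature (iv), sampling grasps from a 3D object model, can be illustrated with a simple antipodal sampling scheme over a point cloud with surface normals: pairs of points are accepted when their outward normals roughly oppose each other along the line connecting them and their distance fits within the gripper opening. This is a generic sketch of the idea with illustrative thresholds, not the BURG Toolkit's actual sampling code.

import numpy as np

def sample_antipodal_pairs(points, normals, max_width=0.08,
                           angle_tol=np.deg2rad(20), n_samples=1000, rng=None):
    """Randomly sample point pairs that form antipodal parallel-jaw grasps.
    Returns a list of (index_1, index_2) contact pairs. Thresholds are illustrative."""
    if rng is None:
        rng = np.random.default_rng(0)
    pairs = []
    for _ in range(n_samples):
        i, j = rng.integers(len(points), size=2)
        d = points[j] - points[i]
        width = np.linalg.norm(d)
        if i == j or width < 1e-6 or width > max_width:
            continue
        d = d / width
        # Antipodal condition: each outward normal roughly opposes the
        # direction towards the other contact along the closing axis.
        if (np.dot(normals[i], -d) > np.cos(angle_tol) and
                np.dot(normals[j], d) > np.cos(angle_tol)):
            pairs.append((i, j))
    return pairs

# Toy usage: points on two parallel faces of a 4 cm thick box.
pts = np.array([[0, -0.02, 0.0], [0, 0.02, 0.0], [0, -0.02, 0.01], [0, 0.02, 0.01]])
nrm = np.array([[0, -1, 0], [0, 1, 0], [0, -1, 0], [0, 1, 0]], dtype=float)
print(len(sample_antipodal_pairs(pts, nrm)))    # number of antipodal pairs found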
Title | Grasp Evaluator |
Description | The Grasp Evaluator is a publicly available Python package for evaluating grasps predicted by different grasp estimation methods. It is designed to work with multiple existing grasping datasets, including our own YCB-8 and YCB-76 datasets. It provides a variety of metrics and measures for evaluation, based on simulation as well as on comparison with ground-truth grasp annotations (where available in the target dataset). |
Type Of Technology | Software |
Year Produced | 2022 |
Impact | The Grasp Evaluator supports a detailed analysis of grasp predictions, which helps provide a better understanding of the strengths and weaknesses of state-of-the-art grasping methods. It also helps improve existing methods as well as develop new methods with the desired capabilities. This software has been used for the duration of our collaborative effort "end-to-end learning to grasp from object point clouds", and it played an important role in enabling the related scientific contributions. It is publicly available on GitHub to encourage other researchers to use it as well. |
URL | https://github.com/mrudorfer/grasp-evaluator |
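The simulation-based measures mentioned above typically reduce to a simple per-grasp success criterion: execute the grasp and a lift in the physics engine, then check whether the object stayed rigidly attached to the gripper. The sketch below shows only that final geometric check, with the simulation itself omitted; the slip threshold and 4x4 pose representation are assumptions for illustration, not the evaluator's actual criterion.

import numpy as np

def grasp_succeeded(T_obj_before, T_grip_before, T_obj_after, T_grip_after,
                    max_slip=0.02):
    """Success check after a simulated lift: the object pose expressed in the
    gripper frame should barely change if the grasp held (threshold illustrative)."""
    rel_before = np.linalg.inv(T_grip_before) @ T_obj_before
    rel_after = np.linalg.inv(T_grip_after) @ T_obj_after
    slip = np.linalg.norm(rel_after[:3, 3] - rel_before[:3, 3])
    return slip <= max_slip

# Toy usage: gripper lifted by 20 cm and the object followed it exactly -> success.
lift = np.eye(4); lift[2, 3] = 0.2
T_grip, T_obj = np.eye(4), np.eye(4)
print(grasp_succeeded(T_obj, T_grip, lift @ T_obj, lift @ T_grip))   # True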