Understanding scenes and events through joint parsing, cognitive reasoning and lifelong learning
Lead Research Organisation:
University of Birmingham
Department Name: School of Computer Science
Abstract
Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
People |
Ales Leonardis (Principal Investigator) |
Description | The University of Birmingham contributed to a better understanding of the role of physics within computer vision. It has identified the advantages and disadvantages of different representations for scenes and scene understanding, as well as different models for physical phenomena. It has developed methods for leveraging models of physics to enhance other computer vision tasks in which physics is not a primary factor. This research has resulted in interdisciplinary and international collaborations, in particular with MIT. Several strands of research within the project have yielded improvements in real-world robotic manipulation. In addition to the focus on physics, the University of Birmingham has developed a number of state-of-the-art techniques and evaluation mechanisms for tasks relevant to robotics, such as object pose estimation and tracking. These techniques and tools will have future impact as they are adopted by other researchers; work on tracking evaluation methods has already influenced, and continues to influence, the research community. The project's objective of collaboration between research groups and between disciplines was hindered by the onset of the COVID-19 pandemic. Collaborations continued over online channels, but the inability to visit research labs due to national and international travel restrictions reduced the effectiveness and outputs of those collaborations. |
Exploitation Route | The methods of leveraging physics for other computer vision tasks developed by the University of Birmingham have the potential to be incorporated into a broad range of computer vision tasks. We expect these to be adopted by the wider research community for individual tasks as appropriate. The researchers at the University of Birmingham intend to build upon their findings on the role of scene representations and physics, and the implications for other vision tasks, to develop interpretable representations with general utility across a range of vision tasks. |
Sectors | Digital/Communication/Information Technologies (including Software), Other
Description | Amazon Research Awards |
Amount | $80,000 (USD) |
Organisation | Amazon.com |
Sector | Private |
Country | United States |
Start |
Description | Babymind: Computational Models Of Sensorimotor Learning And Control |
Amount | £24,930 (GBP) |
Organisation | Ministry of Science, ICT and Future Planning
Sector | Public |
Country | Korea, Republic of |
Start | 01/2020 |
End | 12/2020 |
Description | CHIST-ERA (Object recognition and manipulation by robots: Data sharing and experiment reproducibility (ORMR)) |
Amount | €1,180,634 (EUR)
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 04/2019 |
End | 03/2022 |
Description | Paul and Yuanbi Ramsay Research Fund |
Amount | £2,719 (GBP) |
Organisation | University of Birmingham |
Sector | Academic/University |
Country | United Kingdom |
Start | 03/2017 |
End | 03/2017 |
Description | Tracking membrane receptor interactions via data-driven machine learning |
Amount | £119,134 (GBP) |
Organisation | Alan Turing Institute |
Sector | Academic/University |
Country | United Kingdom |
Start | 04/2020 |
End | 12/2021 |
Title | Multimodal visual and physical scene understanding data set |
Description | The data set consists of simulated scenes of complex collections of objects on tables. Data modalities include RGB images, depth images, 2D semantic maps, volumetric semantic maps, as well as object poses, physical interactions between objects, and object trajectories induced by physical perturbations to the scenes. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | No |
Impact | Used mainly for project research; parts of the data set have also been used by students working on other projects. |
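The modalities listed above can be pictured as a per-scene record. The following is a minimal, illustrative sketch of one sample's layout; the field names, types, and pose encoding are assumptions for illustration, not the data set's actual schema:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# 3-D position plus unit quaternion (an assumed pose encoding)
Pose = Tuple[float, float, float, float, float, float, float]

@dataclass
class SceneSample:
    """One simulated table-top scene; field names are illustrative."""
    rgb: bytes                      # RGB image
    depth: bytes                    # depth image
    semantic_map_2d: bytes          # per-pixel object labels
    semantic_map_3d: bytes          # volumetric object labels
    object_poses: Dict[str, Pose] = field(default_factory=dict)
    interactions: List[Tuple[str, str]] = field(default_factory=list)  # contacting object pairs
    trajectories: Dict[str, List[Pose]] = field(default_factory=dict)  # poses after a perturbation

# Hypothetical usage: record a mug resting on the table
sample = SceneSample(rgb=b"", depth=b"", semantic_map_2d=b"", semantic_map_3d=b"")
sample.object_poses["mug"] = (0.1, 0.0, 0.75, 0.0, 0.0, 0.0, 1.0)
sample.interactions.append(("mug", "table"))
```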
Description | Collaboration with Josh Tenenbaum's group at MIT |
Organisation | Massachusetts Institute of Technology |
Country | United States |
Sector | Academic/University |
PI Contribution | Expertise on stability estimation and robust integration of stability estimation models into larger tasks, contributed via collaboration on a paper. |
Collaborator Contribution | Expertise on scene parsing and physical inference, contributed via leading the collaboration on the paper. |
Impact | NeurIPS 2018 paper. |
Start Year | 2018 |
Description | Collaboration with Mario Fritz's group |
Organisation | Helmholtz Association of German Research Centres |
Country | Germany |
Sector | Academic/University |
PI Contribution | Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers. |
Collaborator Contribution | Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers. |
Impact | Papers: W. Li, A. Leonardis, M. Fritz, Visual Stability Prediction for Robotic Manipulation, IEEE International Conference on Robotics and Automation; M. Wagner, H. Basevi, R. Shetty, W. Li, M. Malinowski, M. Fritz, A. Leonardis, Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions, European Conference on Computer Vision, 521-537.
Start Year | 2016 |
Description | Collaboration with Mario Fritz's group |
Organisation | Max Planck Society |
Department | Max Planck Institute for Informatics |
Country | Germany |
Sector | Charity/Non Profit |
PI Contribution | Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers. |
Collaborator Contribution | Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers. |
Impact | Papers: W. Li, A. Leonardis, M. Fritz, Visual Stability Prediction for Robotic Manipulation, IEEE International Conference on Robotics and Automation; M. Wagner, H. Basevi, R. Shetty, W. Li, M. Malinowski, M. Fritz, A. Leonardis, Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions, European Conference on Computer Vision, 521-537.
Start Year | 2016 |
Title | Software platform for physically realistic scene data generation |
Description | A software tool for generating physically realistic scenes and associated data. It physically simulates the scene creation process using the Bullet physics engine, and applies a bespoke renderer to generate multimodal colour, surface normal, reflectance, and semantic information. The tool also supports more sophisticated physical understanding by generating physical contact and force information. |
Type Of Technology | Software |
Year Produced | 2018 |
Impact | The tool has been used within the research group by PhD students within other projects, and is being adapted for use in the ongoing BURG robotic benchmarking project. |
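The simulation step this tool performs can be illustrated with a toy analogue. The actual tool uses the Bullet physics engine; the sketch below is only a stdlib stand-in, dropping point objects onto a table in one dimension with explicit-Euler integration and recording the contact information the paragraph describes. The function name and parameters are hypothetical:

```python
def settle_objects(drop_heights, dt=0.01, g=9.81, steps=2000):
    """Drop point objects onto a table at z = 0; report final heights and contacts.

    A toy analogue of physically simulated scene creation: the real tool
    delegates this to the Bullet engine and full rigid-body dynamics.
    """
    zs = list(drop_heights)            # current heights
    vs = [0.0] * len(zs)               # current vertical velocities
    contacts = [False] * len(zs)       # whether each object has touched the table
    for _ in range(steps):
        for i in range(len(zs)):
            vs[i] -= g * dt            # gravity
            zs[i] += vs[i] * dt        # explicit Euler position update
            if zs[i] <= 0.0:           # table contact: clamp and come to rest
                zs[i], vs[i] = 0.0, 0.0
                contacts[i] = True
    return zs, contacts

# Hypothetical usage: three objects released from different heights
final_z, in_contact = settle_objects([0.5, 1.0, 2.0])
```

In the real tool, the analogous record of which objects touch which (and with what force) is what supports the contact and force information mentioned above.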
Description | Article in Computer Vision News Magazine
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | Article on project research in Computer Vision News magazine, in their ECCV special issue. Read by the computer vision community and more widely (e.g. by hobbyists).
Year(s) Of Engagement Activity | 2018 |
URL | http://www.rsipvision.com/ECCV2018-Monday/ |
Description | Talk at Amazon Research Awards Robotics Symposium 2019 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Hector Basevi attended the Amazon Research Awards Robotics Symposium 2019 in Boston, USA, to present an extension of project research to constrained environments. The audience consisted of employees of Amazon and other invited research groups, and the talk was later uploaded to Twitch.tv, where it is publicly available.
Year(s) Of Engagement Activity | 2019 |
URL | https://www.twitch.tv/videos/511933761 |