Understanding scenes and events through joint parsing, cognitive reasoning and lifelong learning

Lead Research Organisation: University of Birmingham
Department Name: School of Computer Science

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
 
Description The University of Birmingham contributed to a better understanding of the role of physics within computer vision. It has identified the advantages and disadvantages of different scene representations for scene understanding, as well as of different models of physical phenomena. It has developed methods for leveraging models of physics to enhance other computer vision tasks in which physics is not a primary factor. This research has led to interdisciplinary and international collaborations, in particular with MIT. Several strands of research within the project have yielded improvements in real-world robotic manipulation.

In addition to the focus on physics, the University of Birmingham has developed a number of state-of-the-art techniques and evaluation mechanisms for tasks relevant to robotics, such as object pose estimation and tracking. These techniques and tools will have future impact as other researchers adopt them. Work on tracking evaluation methods has already influenced, and continues to influence, the research community.

The objective of collaboration between research groups and between disciplines was affected by the onset of the COVID-19 pandemic. Collaborations continued over online channels, but national and international travel restrictions made it impossible to visit research labs, which reduced the effectiveness and outputs of those collaborations.
Exploitation Route The methods of leveraging physics for other computer vision tasks developed by the University of Birmingham have the potential to be incorporated into a broad range of computer vision tasks. We expect the wider research community to adopt them for individual tasks as appropriate. The researchers at the University of Birmingham intend to build upon these findings on the role of scene representations and physics, and their implications for other vision tasks, to develop interpretable representations with general utility across a range of vision tasks.
Sectors Digital/Communication/Information Technologies (including Software), Other

 
Description Amazon Research Awards
Amount $80,000 (USD)
Organisation Amazon.com 
Sector Private
Country United States
Start  
 
Description Babymind: Computational Models Of Sensorimotor Learning And Control
Amount £24,930 (GBP)
Organisation Ministry of Science ICT and Future Planning 
Sector Public
Country Korea, Republic of
Start 01/2020 
End 12/2020
 
Description CHIST-ERA (Object recognition and manipulation by robots: Data sharing and experiment reproducibility (ORMR))
Amount €1,180,634 (EUR)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 04/2019 
End 03/2022
 
Description Paul and Yuanbi Ramsay Research Fund
Amount £2,719 (GBP)
Organisation University of Birmingham 
Sector Academic/University
Country United Kingdom
Start 03/2017 
End 03/2017
 
Description Tracking membrane receptor interactions via data-driven machine learning
Amount £119,134 (GBP)
Organisation Alan Turing Institute 
Sector Academic/University
Country United Kingdom
Start 04/2020 
End 12/2021
 
Title Multimodal visual and physical scene understanding data set 
Description The data set consists of simulated scenes of complex collections of objects on tables. Data modalities include RGB images, depth images, 2D semantic maps, volumetric semantic maps, as well as object poses, physical interactions between objects, and object trajectories induced by physical perturbations to the scenes. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact Mainly used within project research; parts of the data set have also been used by students working on other projects. 
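The modalities listed in the description above can be pictured as a single per-scene record. The sketch below is purely illustrative: the field names, shapes, and types are hypothetical and are not the data set's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical per-scene record mirroring the modalities described above.
# Field names and shapes are illustrative, not the data set's actual schema.

@dataclass
class ObjectState:
    object_id: int
    pose: Tuple[float, ...]  # e.g. (x, y, z, qx, qy, qz, qw)

@dataclass
class SceneRecord:
    rgb: List[List[Tuple[int, int, int]]]    # H x W RGB image
    depth: List[List[float]]                 # H x W depth map (metres)
    semantic_2d: List[List[int]]             # H x W per-pixel class labels
    semantic_3d: List[List[List[int]]]       # D x H x W volumetric labels
    objects: List[ObjectState] = field(default_factory=list)
    contacts: List[Tuple[int, int]] = field(default_factory=list)  # object-id pairs in contact
    trajectories: Dict[int, List[Tuple[float, ...]]] = field(default_factory=dict)  # poses over time after a perturbation

# Tiny 2x2 example scene: one object (id 1) resting on the table (id 0).
scene = SceneRecord(
    rgb=[[(0, 0, 0), (255, 255, 255)], [(128, 128, 128), (0, 0, 0)]],
    depth=[[1.0, 1.2], [1.1, 1.3]],
    semantic_2d=[[0, 1], [0, 0]],
    semantic_3d=[[[0, 1], [0, 0]]],
    objects=[ObjectState(object_id=1, pose=(0.0, 0.0, 0.05, 0.0, 0.0, 0.0, 1.0))],
    contacts=[(1, 0)],
)
print(len(scene.objects), scene.contacts[0])
```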
 
Description Collaboration with Josh Tenenbaum's group at MIT 
Organisation Massachusetts Institute of Technology
Country United States 
Sector Academic/University 
PI Contribution Expertise on stability estimation and on robust integration of stability estimation models into larger tasks, contributed via collaboration on a paper.
Collaborator Contribution Expertise on scene parsing and physical inference, contributed by leading the collaboration on the paper.
Impact NeurIPS 2018 paper.
Start Year 2018
 
Description Collaboration with Mario Fritz's group 
Organisation Helmholtz Association of German Research Centres
Country Germany 
Sector Academic/University 
PI Contribution Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers.
Collaborator Contribution Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers.
Impact Papers:
W Li, A Leonardis, M Fritz, Visual Stability Prediction for Robotic Manipulation, IEEE International Conference on Robotics and Automation.
M Wagner, H Basevi, R Shetty, W Li, M Malinowski, M Fritz, A Leonardis, Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions, European Conference on Computer Vision, 521-537.
Start Year 2016
 
Description Collaboration with Mario Fritz's group 
Organisation Max Planck Society
Department Max Planck Institute for Informatics
Country Germany 
Sector Charity/Non Profit 
PI Contribution Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers.
Collaborator Contribution Expert input and guidance, data sets, and implementation of algorithm models. The collaboration resulted in multiple papers.
Impact Papers:
W Li, A Leonardis, M Fritz, Visual Stability Prediction for Robotic Manipulation, IEEE International Conference on Robotics and Automation.
M Wagner, H Basevi, R Shetty, W Li, M Malinowski, M Fritz, A Leonardis, Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions, European Conference on Computer Vision, 521-537.
Start Year 2016
 
Title Software platform for physically realistic scene data generation 
Description A software tool for generating physically realistic scenes and associated data. It physically simulates the scene-creation process using the Bullet physics engine, and applies a bespoke renderer to generate multimodal colour, surface-normal, reflectance, and semantic information. The tool also supports more sophisticated physical understanding by generating physical contact and force information. 
Type Of Technology Software 
Year Produced 2018 
Impact The tool has been used within the research group by PhD students within other projects, and is being adapted for use in the ongoing BURG robotic benchmarking project. 
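At its core, a tool of this kind runs a simulate-then-record loop: objects are dropped into the scene, the physics engine settles them, and their trajectories and resting poses are recorded. The toy sketch below illustrates that loop in one dimension with a simple Euler integrator; the actual tool uses the Bullet engine and a bespoke renderer, and all names and constants here are illustrative assumptions, not the tool's implementation.

```python
# Illustrative simulate-then-record loop for physically realistic scene
# generation. The real tool uses the Bullet physics engine; this toy 1-D
# Euler integrator and its constants are purely illustrative.

G = -9.81          # gravity (m/s^2)
DT = 1.0 / 240.0   # simulation time step (s), a common physics-engine default
TABLE_Z = 0.0      # table surface height (m)

def settle_object(z0: float, steps: int = 2000) -> list:
    """Drop an object from height z0 and record its height trajectory."""
    z, vz = z0, 0.0
    trajectory = [z]
    for _ in range(steps):
        vz += G * DT
        z += vz * DT
        if z <= TABLE_Z:       # contact with the table: inelastic rest
            z, vz = TABLE_Z, 0.0
        trajectory.append(z)
    return trajectory

# Record the trajectory induced by dropping an object from 0.5 m.
traj = settle_object(z0=0.5)
print(f"start {traj[0]:.2f} m, end {traj[-1]:.2f} m, {len(traj)} samples")
```

In the actual pipeline, each settled scene would then be rendered to produce the colour, surface-normal, reflectance, and semantic outputs described above, with contact and force information read back from the physics engine.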
 
Description Article in Computer Vision News magazine 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact Article on project research in Computer Vision News magazine, via their ECCV special issue. Read by the computer vision community and more widely (e.g. by hobbyists).
Year(s) Of Engagement Activity 2018
URL http://www.rsipvision.com/ECCV2018-Monday/
 
Description Talk at Amazon Research Awards Robotics Symposium 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Hector Basevi attended the Amazon Research Awards Robotics Symposium 2019 in Boston, USA, to present an extension of project research to constrained environments. The audience consisted of Amazon employees and other invited research groups, and the talk was later uploaded to Twitch.tv, where it is publicly available.
Year(s) Of Engagement Activity 2019
URL https://www.twitch.tv/videos/511933761