ROSSINI: Reconstructing 3D structure from single images: a perceptual reconstruction approach

Lead Research Organisation: University of Southampton
Department Name: Sch of Psychology

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Publications

 
Description Real-world scene recognition and categorisation occur rapidly for human perceivers, yet the underlying mechanisms are poorly understood. Additionally, computer vision algorithms for estimating depth and semantically segmenting a single monocular image perform poorly as scene complexity and variability increase, primarily because of the limited training sets used. Here, we investigate both human and computer vision systems in a series of tasks related to estimating depth from a natural scene and segmenting that scene accurately into category labels appropriate for a rich, diverse natural world.

The ability of human observers to identify and organise visual information into categories is a popular metric of scene recognition and understanding in behavioural and computational research. However, categorical constructs and their labels can be somewhat arbitrary. We have developed a new algorithm for describing human-centred categorisation of scene information that outperforms previous state-of-the-art descriptions in the literature. We go on to investigate the time-course of spatial and semantic scene perception, finding that, contrary to space-centred theories of human vision, humans exploit semantic information to discriminate spatial structure categories rather than using spatial layout to infer semantic categories. Together, these findings challenge traditional 'bottom-up' views of scene perception.

We made use of the SYNS dataset, a large repository of LiDAR and image data spanning a range of natural scene categories, to evaluate state-of-the-art depth and semantic segmentation algorithms. In two monocular depth challenges, we demonstrate that, whilst current algorithms have vastly improved in their ability to describe depth from a single monocular view, they perform less well when tasked with doing so across a variety of natural scenes, and for specific aspects of scenes such as assigning depth at object boundaries and accurately estimating metric depth. Our work has led to follow-on research focussed on how heuristics used by human perception may inform and simplify the computational task.
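
As a rough illustration of the kind of evaluation described above, the sketch below computes common monocular depth error metrics (absolute relative error, RMSE on metric depth, and a threshold accuracy) for a predicted depth map against LiDAR-derived ground truth. The function name, metric choices, and random example inputs are hypothetical, standard-convention assumptions for illustration only, not the project's own evaluation code or the SYNS benchmark protocol.

```python
import numpy as np

def depth_metrics(pred, gt, mask=None):
    """Illustrative monocular depth metrics (common conventions, not the project's code)."""
    if mask is None:
        mask = gt > 0                              # ignore pixels with no LiDAR ground truth
    p, g = pred[mask], gt[mask]

    abs_rel = np.mean(np.abs(p - g) / g)           # absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))          # RMSE on metric depth (e.g. metres)
    ratio = np.maximum(p / g, g / p)
    delta1 = np.mean(ratio < 1.25)                 # fraction of pixels within a 1.25x ratio
    return {"AbsRel": abs_rel, "RMSE": rmse, "delta<1.25": delta1}

# Hypothetical example: compare a predicted depth map with a ground-truth depth map.
pred = np.random.uniform(1, 50, (100, 100))        # predicted depths in metres (made-up data)
gt = np.random.uniform(1, 50, (100, 100))          # LiDAR-derived depths in metres (made-up data)
print(depth_metrics(pred, gt))
```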
Exploitation Route Yes. Understanding how human observers organise complex visual information will help researchers to better describe human perception, and will also enable computational models that use this information in processing natural scenes to become more efficient and effective.
Sectors Creative Economy

Digital/Communication/Information Technologies (including Software)

 
Title Dataset supporting the publication: The time-course of real-world scene perception: spatial and semantic processing 
Description This dataset is supporting the publication "The time-course of real-world scene perception: spatial and semantic processing". The data includes experimental data and key analyses (R and MATLAB scripts) that accompany the paper. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://eprints.soton.ac.uk/id/eprint/482269
 
Description Depth and scene gist 
Organisation York University Toronto
Country Canada 
Sector Academic/University 
PI Contribution In this collaborative research project, I am conducting the research using the SYNS dataset, which was created as a key outcome of the EPSRC grant
Collaborator Contribution Addition of expertise in stereo depth processing from Professor Laurie Wilcox
Impact None yet
Start Year 2016
 
Description BMVA workshop in London 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Around 60 people attended the workshop, the theme of which was 3D reconstruction in both humans and machines. We had two international speakers.
Year(s) Of Engagement Activity 2020
URL https://britishmachinevisionassociation.github.io/meetings/20-01-29-3D%20worlds%20from%202D%20images...