Learning to see in depth: neural models of binocular stereopsis

Lead Research Organisation: University of Essex
Department Name: Psychology

Abstract

When you go to see a 3D movie, you are provided with a pair of glasses to be worn while in the cinema. You then experience a vivid awareness of the three-dimensional shape of the objects and people in the movie, which appear to leave the screen and occupy space within the room. This occurs because the glasses allow the cinema to present two slightly different versions of the movie to your left and right eye.

The purpose of this proposal is to determine how your brain interprets these differences between what is seen by your two eyes. We know that this is achieved by neurons in the brain that respond to these differences. We will establish how the brain interprets the responses of these neurons to provide you with a clear appreciation of the three-dimensional shape of objects.

This research is important in providing a good theoretical understanding of binocular vision, so that we may develop successful therapies for conditions such as amblyopia and strabismus ("lazy eye" and "squint") in which binocular depth perception may be impaired or absent.

It is also important in enabling designers of movies and virtual reality systems to make the results as "real" as possible. This research will help us to understand the binocular image differences that our brains respond to, and how the brain uses these to determine three-dimensional shape. This can then directly inform the design of virtual reality systems.
Finally, it is important in allowing engineers to develop robots and vehicles that can "see"; what we learn about how our brains achieve this is very valuable in developing artificial, computer vision systems.

Technical Summary

Binocular vision provides important information allowing us to see in depth. This information is encoded by binocular neurons in the primary visual cortex. We know, however, that depth perception does not arise in any simple manner from the activity in these disparity-tuned neurons. Rather, the perception of depth involves complex excitatory and inhibitory interactions between these neurons, and activity in extrastriate visual areas.
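As background to the paragraph above, the standard account of disparity-tuned neurons in the primary visual cortex is the disparity-energy model, in which a cell sums the squared responses of a quadrature pair of binocular filters whose left/right phase offset sets the preferred disparity. The sketch below is purely illustrative and not code from the project; the function names, filter support and parameter values are all our own choices.

```python
import numpy as np

def gabor(x, freq, phase, sigma=2.0):
    """1D Gabor profile: a Gaussian-windowed cosine receptive field."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * freq * x + phase)

def energy_response(left_signal, right_signal, freq=0.25, dphase=np.pi / 2):
    """Disparity-energy response of a model complex cell.

    The left/right phase offset `dphase` determines the cell's preferred
    disparity (phase-shift encoding). Input signals must have length 17
    to match the filter support used here.
    """
    x = np.arange(-8, 9, dtype=float)
    resp = 0.0
    for base in (0.0, np.pi / 2):              # quadrature pair of filters
        l = np.dot(gabor(x, freq, base), left_signal)
        r = np.dot(gabor(x, freq, base + dphase), right_signal)
        resp += (l + r) ** 2                   # squared binocular sum
    return resp

# A simple tuning curve: probe the cell with a 1D "image" whose right-eye
# copy is shifted by different disparities.
rng = np.random.default_rng(0)
left = rng.standard_normal(17)
tuning = [energy_response(left, np.roll(left, d)) for d in range(-4, 5)]
```

The non-trivial perceptual processing described above is precisely what lies beyond this encoding stage: the raw energy responses are ambiguous, and further excitatory/inhibitory pooling is needed to turn them into perceived depth.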

This project will develop models of this processing, based on Independent Components Analysis and the spectral properties of binocular images. We will create a novel set of binocular images with ground-truth data with which to develop and test these models. Our objectives in developing these models are twofold. First, they will provide a theoretical framework with which to understand binocular visual processing in the brain, beyond the initial encoding stage. Second, they will be used to develop novel, biologically-inspired algorithms for binocular stereopsis.
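To illustrate the kind of analysis described above, independent components can be learned from concatenated left/right image patches, so that each learned component is a joint binocular filter whose two halves are the left-eye and right-eye receptive fields. The sketch below is a minimal illustration using scikit-learn's FastICA on synthetic data; the patch size, component count and the use of a shifted-noise stereo pair are arbitrary choices of ours, standing in for the project's calibrated natural images.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Synthetic stand-in for a stereo pair: the right image is a horizontally
# shifted copy of the left, mimicking a uniform binocular disparity.
left = rng.standard_normal((128, 128))
shift = 2  # pixels of horizontal disparity
right = np.roll(left, shift, axis=1)

# Sample paired left/right patches and concatenate them, so each training
# vector contains the same scene location as seen by both "eyes".
patch = 8
patches = []
for _ in range(2000):
    r = rng.integers(0, 128 - patch)
    c = rng.integers(0, 128 - patch)
    l_patch = left[r:r + patch, c:c + patch].ravel()
    r_patch = right[r:r + patch, c:c + patch].ravel()
    patches.append(np.concatenate([l_patch, r_patch]))
X = np.asarray(patches)

# Each independent component is a joint binocular filter: its first half
# is the left-eye receptive field, its second half the right-eye field.
ica = FastICA(n_components=32, random_state=0, max_iter=1000)
ica.fit(X)
filters = ica.components_                       # shape (32, 2 * patch * patch)
left_fields = filters[:, :patch * patch].reshape(-1, patch, patch)
right_fields = filters[:, patch * patch:].reshape(-1, patch, patch)
```

With natural stereo images in place of noise, the positional and phase relationships between the paired left and right fields are what characterise the learned binocular encoding.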

Planned Impact

The research has significant potential for impact within and beyond academia:

Academia

Vision Science: The results will be of significance to those studying binocular vision from behavioural and physiological perspectives.

Computer Vision: The development of artificial vision systems has great potential to benefit from our understanding of how our (highly successful) biological visual system operates.

Optometry: A thorough understanding of binocular vision provides the necessary science underpinning the development of therapeutic interventions in cases in which this is impaired.

Business/Industry

The development of biologically inspired stereo algorithms will benefit practitioners working in computer vision. Basing our algorithms on state-of-the-art knowledge of the neural processing of binocular information will bring new ways of thinking to this field.

The General Public

The recent resurgence of 3D movies means that there is significant public interest in binocular depth perception. There is therefore a demand from the general public to understand the science of binocular depth perception, and an opportunity to use this as an engaging way to present neuroscience.

Pathways to Impact

Hibbard, together with colleagues at other Scottish universities, is currently developing the Scottish Vision Group website. This will provide information, and a forum for discussion, for a variety of audiences. The first is the general public. We will provide accessible summaries of the state of knowledge of vision, in the context of (i) fundamental science, (ii) understanding of disease and sensory impairment, and (iii) technology. The second audience is industry and the third sector; by bringing together the expertise available throughout Scotland in one place, we will make it easy to establish contacts with companies and charities that could benefit from our work. This will allow us to develop our current links (e.g. with the Fife Society for the Blind, and NCR). The third audience is each other, and academics more generally; the website hosts blogs and discussion groups to facilitate networking across the community.
A prototype of the website has been built, and is hosted on servers at St Andrews; the site is scheduled to be made publicly available at the end of 2012.

Academic work will be presented at a variety of conferences in neuroscience and machine vision, to ensure impact in both of these fields. Interactive presentations to the general public will be made at the annual science fair hosted by the University of St Andrews.

People

The researcher employed on the project will develop considerable skills that are of value beyond the immediate project. These include the application of advanced programming skills to a difficult problem in biology, and the analytical, writing and presentation skills required to present the results of the research at conferences, in journal papers, and to the general public.
 
Title The Mystery of the Raddlesham Mumps 
Description The Mystery of the Raddlesham Mumps is a poem written by Murray Lachlan Young (Poet in Residence at BBC 6 Music) which has also been developed into a play. We (myself and Loes van Dam and Liam Jarvis, University of Essex) have worked with the creative team responsible for this to help them design a virtual reality application to complement the project. 
Type Of Art Artefact (including digital) 
Year Produced 2019 
Impact To date, our expertise has helped shape the VR experience, and to ensure its safe development (taking into account, for example, the differences in binocular vision between adults and children). The play will be performed from April 2019, and we will carry out additional research to understand the contribution of the VR app to the experience. This is directly related to the objective of the award to 'directly inform the design of virtual reality systems.' 
URL http://raddleshammumps.co.uk/
 
Description 1. Image database. We have devised an improved methodology in which, in addition to capturing 3D objects, we have produced software for creating scenes. The advantage of this approach is that it removes many of the difficulties faced in calibration, and allows great flexibility in generating images. For example, it allows for the parametric variation of viewing parameters (e.g. vergence) while viewing the same scene - something that would have been beyond our original approach. We have a growing database and the results have been used in our own research. The final version was presented at the Vision Sciences Society meeting in May 2016.
2. Independent Component Analysis. The first part of this has been completed and published in Journal of Vision. An extension was presented at the Vision Sciences Society in May 2014 (to be published in Journal of Vision), and has now been completed for submission.
3. In addition to performing Independent Component Analysis, we have also applied Independent Subspace Analysis. This works towards our goal of understanding binocular encoding beyond the initial stage of individual filters, with a view to establishing the necessary invariances (e.g. to luminance phase) to create reliable disparity sensitivity. This has been published in PLOS ONE.
4. We have published additional work, modelling the responses of cells higher in the processing pathway (compared with the modelling included in the papers above) in Vision Research.
5. We have developed a computational model which extracts relevant depth structure from binocular information. This neural model of binocular depth perception directly links the outputs of early stages of processing in the brain to the perception of depth. In doing so, it provides a theoretically-driven (rather than data fitting) approach to understanding the activity of cells higher in the visual brain. This work was presented at the European Conference on Visual Perception in late 2016; the manuscript is in preparation. We have also performed novel experiments with human observers to test this model.
6. We have developed a model of the early stages of binocular vision, based on the idea that information should be encoded as efficiently as possible. This model provides a theoretical explanation of the properties of neurons in the early visual cortex. We have performed novel experiments with human observers to test this model.
7. Analysis of complex 3D models with ground truth. Having developed a library of 3D scans of natural objects, we have performed statistical analysis of the 3D structure of these objects. This allows us to understand the complex 3D structure that needs to be encoded by the visual system. In the same way that the simple distributions of depth and binocular disparity have allowed us to understand the early encoding of binocular information, this analysis helps us to understand the way in which more complex shape properties are also encoded at higher stages in the brain.
8. We have used these to show how luminance contributes to the perception of depth locally, but not across space within images.
Exploitation Route We have made code and images available, and these have the potential for impact in computer vision and the creative industries.
Sectors Aerospace, Defence and Marine; Digital/Communication/Information Technologies (including Software); Culture, Heritage, Museums and Collections

 
Description The research findings have been presented in a public lecture and discussion. The aim of this was to educate, inform and engage with the general public. We also have an ongoing programme of engagement with local business around the 3D creative industries, and are currently exploring future collaboration around this. The work on complex 3D scenes, and the development of a 3D database, and methods to present these on 3D displays and in virtual reality, has opened a novel research area, at a time when consumer VR is rapidly developing. This has allowed us to secure funding from Facebook Oculus, one of the leading manufacturers of consumer virtual reality, in order to work directly with them in understanding how perception science can be used to optimise the VR experience. This work has received TV coverage. I have developed a website whose primary aim is to explain the research area in an accessible and engaging way. A number of new projects in the Digital Creative Sector have developed from the project. The first is involvement in the development of a virtual reality application to complement a multimedia (poetry, audio, play, VR) arts project (The Mystery of the Raddlesham Mumps). The second is that we have secured funding to develop augmented reality to aid with cortical sight loss. We completed a Ministry of Defence Defence and Security Accelerator project to develop machine learning for automatic image segmentation. The work from the BBSRC project is critical to this, as the database of segmented scenes of real-world objects will be used as the training database. The outcomes of this were successful in that we provided a network model that was able to accurately segment scenes and estimate depth. The Defence Science and Technology Laboratory launched a new funding scheme based on this successful project, with funding for both our research group and SeeByte.
Sector Aerospace, Defence and Marine; Creative Economy; Digital/Communication/Information Technologies (including Software); Education; Healthcare; Leisure Activities, including Sports, Recreation and Tourism; Culture, Heritage, Museums and Collections
Impact Types Cultural

 
Description Augmented Reality for Visual Impairment
Amount £36,660 (GBP)
Funding ID POC008 
Organisation Eastern ARC 
Sector Academic/University
Country United Kingdom
Start 05/2019 
End 04/2020
 
Description DSTL EO Image Processing: Robust Deep Networks for Depth Sensing and Scene Understanding
Amount £92,514 (GBP)
Organisation Ministry of Defence (MOD) 
Sector Public
Country United Kingdom
Start 11/2021 
End 04/2022
 
Description Deep learning for depth-based image segmentation
Amount £98,251 (GBP)
Funding ID ACC6012885 
Organisation Ministry of Defence (MOD) 
Sector Public
Country United Kingdom
Start 04/2020 
End 12/2020
 
Description Knowledge Transfer Partnership
Amount £201,108 (GBP)
Funding ID 10027152 
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start  
 
Description Oculus Call for Research
Amount $86,981 (USD)
Organisation Facebook 
Sector Private
Country United States
Start 07/2017 
End 07/2018
 
Description Research Project Grant
Amount £71,714 (GBP)
Funding ID RPG-2016-361 
Organisation The Leverhulme Trust 
Sector Charity/Non Profit
Country United Kingdom
Start 10/2017 
End 09/2020
 
Description Efficient encoding of binocular disparity 
Organisation McGill University
Country Canada 
Sector Academic/University 
PI Contribution This is an ongoing research collaboration between Fred Kingdom (McGill), Keith May (Essex) and myself to broaden our understanding of the efficient encoding of binocular information. We have contributed to the design of novel experiments, to computer simulations of neural responses (making direct use of work from BB/K018973/1 Learning to see in depth: neural models of binocular stereopsis) and to the preparation of manuscripts for publication.
Collaborator Contribution Fred Kingdom (McGill) has led this international collaboration, reaching out to draw on the skills and knowledge of myself and Keith May, which are partly founded on my work on (BB/K018973/1 Learning to see in depth: neural models of binocular stereopsis).
Impact Kingdom, F., May, K. and Hibbard, P. (2017) Stereoscopic depth perception is differentially affected by adaptation to binocularly correlated versus binocularly anti-correlated noise, European Conference on Visual Perception
Start Year 2017
 
Description Integrating the senses in virtual reality 
Organisation Facebook
Country United States 
Sector Private 
PI Contribution This is also listed under further funding. This is a collaboration with Facebook Oculus (with Loes van Dam (PI) and Peter Scarfe) to carry out fundamental research underpinning the development of consumer virtual reality.
Collaborator Contribution Facebook Oculus contribute funding (for staff, equipment, travel and consumables) and regular research meetings to guide the development of the work.
Impact Presentations at the Vision Sciences Annual Meeting (May 2018) and visits to Oculus (later in 2018) are planned.
Start Year 2017
 
Title Source code for Binocular Independent Component Analysis 
Description Source code and data of the analysis performed in the following publication: Hunter, D.W. & Hibbard, P.B. (2015) Distribution of independent components of binocular natural images, Journal of Vision, 15(13):6, 1-31, doi:10.1167/15.13.6 
Type Of Technology Webtool/Application 
Year Produced 2015 
Impact Too early to say - only completed recently. 
URL https://github.com/DavidWilliamHunter/Bivis
 
Description Colchester Ultrafast Broadband launch 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact The event was to launch new initiatives around ultrafast broadband in Colchester. The activities were to showcase digital business and research in the local area. We used this to develop links with a number of local businesses, and are currently actively pursuing potential collaborations, with the University of Essex Research and Enterprise Office. We have also been invited back for additional meetings, with a view to helping to direct local funding in this area.
Year(s) Of Engagement Activity 2017
URL http://www.colchesterultraready.com
 
Description Hibbard Website 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Website developed to explain research in an accessible way so as to engage with the general public
Year(s) Of Engagement Activity 2016
URL http://www.paulhibbard.org
 
Description Public Lecture 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact A public lecture in which I presented my research findings. This was followed by question and answer session. We also gave out free virtual reality headsets to encourage people to think about the importance of their 3D vision, and the importance of this emerging technology.
Year(s) Of Engagement Activity 2016
URL http://www.essex.ac.uk/events/event.aspx?e_id=9769
 
Description Workshop to establish links between researchers working in biological, clinical and engineering aspects of binocular vision 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact 30 researchers (undergraduate, postgraduate, staff) from universities across the UK came together in a 2 day workshop to discuss the challenges of working in binocular vision from the perspectives of biology, engineering and clinical practice. We discussed practical ways in which individuals could work together more effectively across traditional discipline barriers.
Year(s) Of Engagement Activity 2015