Binocular vision and the statistics of binocular disparity

Lead Research Organisation: University of St Andrews
Department Name: Psychology


Technical Summary

By comparing the images formed in our two eyes, we are able to effortlessly perceive the three-dimensional world around us in vivid detail. This ability far surpasses that of any computer- or robot-based vision system currently available. We will investigate how this might be achieved by our brains. We will test the notion that our brain combines the information gathered by our eyes with its own prior assumptions or best guesses about the most likely structure of the world around us. We will test the nature of any such biases, and whether they represent an accurate reflection of our environment.


Description Our two eyes view the world from slightly different vantage points. The resulting differences in the images formed on our two retinas are an important source of information about the three-dimensional shape and position of objects. The significance of this information is highlighted, for example, by the recent interest in 3D films at the cinema, and the promise of 3D television in our homes in the near future. The depth effects in these films depend on the presentation of different images to the two eyes.

For us to make sense of this information, and move from the two-dimensional retinal images to the vivid appearance of a three-dimensional world, is a difficult computational problem. We are beginning to understand how this is solved. The first stage in this process is to identify regions of the images in the two eyes that are most similar. However, this simple process does not result in anything like a good description of the three-dimensional shape of objects. The perception of depth therefore requires significantly more than this.
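The matching stage described above can be illustrated with a minimal sketch. It assumes a one-dimensional strip of intensities from each eye and uses a sum of squared differences as the similarity measure; the function name `best_disparity`, its parameters, and the sample values are hypothetical and are not taken from the project's own models.

```python
def best_disparity(left, right, centre, half_width, max_disparity):
    """Find the horizontal shift at which a right-eye patch best matches a left-eye patch.

    Illustrative only: similarity is a sum of squared differences,
    an assumption made for this sketch.
    """
    patch = left[centre - half_width : centre + half_width + 1]
    best_shift, best_cost = 0, float("inf")
    for shift in range(-max_disparity, max_disparity + 1):
        start = centre - half_width + shift
        end = centre + half_width + 1 + shift
        # Skip shifts that would fall outside the right-eye strip.
        if start < 0 or end > len(right):
            continue
        candidate = right[start:end]
        cost = sum((a - b) ** 2 for a, b in zip(patch, candidate))
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


# Made-up intensities: the same feature appears two samples further
# to the right in the right-eye strip.
left = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0, 0]
right = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
shift = best_disparity(left, right, centre=4, half_width=2, max_disparity=3)
print(shift)
```

A search of this kind yields only a shift for each patch. On its own it says nothing about which of several equally good matches to prefer, which is one reason the perception of depth needs more than matching.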

Our research addressed one simple idea about what this something more might be. Specifically, we reasoned that as our environment is relatively predictable, we might make assumptions about the shape and layout of objects with which to guide the interpretation of the retinal images. As an example, as an object is moved further away from an observer, the images it will form in their eyes will get smaller and smaller. Equally, because things tend to be opaque, as we consider objects at greater distances, it becomes more likely that part or all of the object will not be visible because it will be behind a nearer object. We might therefore expect that most points in the world that we can see originate from relatively close objects. This is borne out when we measure the distribution of distances in the real world.
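The shrinking of the image with distance follows from simple geometry, and a short sketch can make it concrete. It assumes a flat object viewed face-on and uses the standard visual-angle formula; the function name `angular_size_deg` and the example sizes and distances are hypothetical, chosen only for illustration.

```python
import math


def angular_size_deg(object_size_m, distance_m):
    """Visual angle, in degrees, subtended by an object viewed face-on."""
    return math.degrees(2.0 * math.atan(object_size_m / (2.0 * distance_m)))


# A 0.5 m wide object at increasing distances (illustrative values).
distances_m = (1.0, 2.0, 4.0, 8.0)
sizes = [angular_size_deg(0.5, d) for d in distances_m]
for d, s in zip(distances_m, sizes):
    print(f"{d:4.1f} m -> {s:5.2f} deg")
```

For an object that is small relative to its distance, each doubling of distance roughly halves the visual angle, so distant objects contribute fewer visible points to the retinal image.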

The idea addressed by our research was: are such statistical expectations useful when trying to understand how people make sense of the three-dimensional structure of their environment?

Specifically, we asked the following questions:

Are some matches between the eyes preferred over others?
Are some surfaces easier to see than others?
Are we biased in the way we perceive shape?

In all cases, we can ask whether the preferences shown reflect the kinds of surfaces we might expect in our environment.

The answer to the first question is yes, we prefer to match things so as to minimise the depth variation across the scene as a whole. This is more important than matching to produce a smooth surface. It is also inconsistent with simple matching models, and shows one way in which they can be improved.

The answer to the second question is also yes. Surfaces with neither too much nor too little depth are more readily detected. These results are again not consistent with simple matching models, which predict the best detection for surfaces with the smallest depth variation. The results are however consistent with the distributions of slant in the natural environment.

The answer to the final question is again yes. However, we can account for these biases by assuming that the observer is simply trying to estimate the differences between the two eyes' views as accurately as possible. Adding information derived from analyses of the natural environment to our model of this process does not improve our understanding of these biases.

We have identified the kinds of surfaces that observers are best at seeing. We have highlighted ways in which these preferences are not compatible with a simple model of binocular vision, but are consistent with the kinds of surfaces found in the environment. We have also built a mathematical model to account for the biases shown by observers. However, extending this model to take account of measurements of the structure of the environment does not shed additional light on the process.
Exploitation Route These findings provided some of the first work on understanding the statistical structure of binocular images. This can be, and has been, built upon by other scientists seeking to understand how our visual system (from individual neurons to the whole system) is optimised for processing this information. It can also be used to inspire computer vision algorithms, and to optimise 3D content in films, TV and other display technologies.
Sectors Aerospace, Defence and Marine; Creative Economy; Digital/Communication/Information Technologies (including Software); Leisure Activities, including Sports, Recreation and Tourism; Culture, Heritage, Museums and Collections

Description The main impact of the work has been within the academic community, as it was fundamental research that was not directly linked to a specific application. During and after the award we have taken many opportunities to engage with the public (mainly through science days, museum exhibits and public talks). However, the main focus of this engagement has been the general areas of human vision and 3D vision. This more general approach to engaging with the public, rather than a constrained focus on the specific outcomes of the particular research, is more appropriate for this type of work, and we believe that the public will have benefited much more from this approach.
First Year Of Impact 2006
Sector Education; Culture, Heritage, Museums and Collections
Impact Types Cultural

Title 3D Image database 
Description Collection of calibrated binocular images 
Type Of Material Database/Collection of data 
Year Produced 2011 
Provided To Others? Yes  
Impact Scientific publications: Perceived direction of motion determined by adaptation to static binocular images. KA May, L Zhaoping, PB Hibbard. Current Biology 22 (1), 28-32