Testing view-based and 3D models of human navigation and spatial perception

Lead Research Organisation: University of Reading

Department Name: School of Psychology and Clinical Language Sciences

Abstract

The way that animals use visual information to move around and interact with objects involves a highly complex interaction between visual processing, neural representation and motor control. Understanding the mechanisms involved is of interest not only to neuroscientists but also to engineers who must solve similar problems when designing control systems for autonomous mobile robots and other visually guided devices.

Traditionally, neuroscientists have assumed that the representation delivered by the visual system and used by the motor system is something like a 3D model of the outside world, even if the reconstruction is a distorted version of reality. Recently, evidence against such a hypothesis has been mounting and an alternative type of theory has emerged. 'View-based' models propose that the brain stores and organises a large number of sensory contexts for potential actions. Instead of storing the 3D coordinates of objects, the brain creates a visual representation of a scene using 2D image parameters, such as widths or angles, and information about the way that these change as the observer moves. This project examines the human representation of three-dimensional scenes to help distinguish between these two opposing hypotheses.
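To make the idea of a 2D image parameter concrete, the following is a minimal sketch, assuming a flat ground plane and point-like landmarks. The function names `subtended_angle` and `angle_change` are hypothetical and are not taken from the project. It computes the angle that two landmarks subtend at the observer and how that angle changes when the observer moves, which is the kind of quantity a view-based scheme could store in place of 3D coordinates.

```python
import math

def subtended_angle(observer, a, b):
    """Angle (radians, 0..pi) between the directions from observer to a and to b.

    All points are (x, y) tuples on an assumed 2D ground plane.
    """
    ax, ay = a[0] - observer[0], a[1] - observer[1]
    bx, by = b[0] - observer[0], b[1] - observer[1]
    cross = ax * by - ay * bx   # proportional to sin of the angle
    dot = ax * bx + ay * by     # proportional to cos of the angle
    return abs(math.atan2(cross, dot))

def angle_change(observer_before, observer_after, a, b):
    """Change in the subtended angle produced by an observer movement."""
    return (subtended_angle(observer_after, a, b)
            - subtended_angle(observer_before, a, b))

# Illustrative values only: stepping towards a pair of landmarks widens the angle.
delta = angle_change((0.0, 0.0), (0.5, 0.0), (1.0, 1.0), (1.0, -1.0))
```

Note that nothing in this sketch recovers the distance to either landmark; only the angle and its change with observer movement are represented.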

To do this, we will use immersive virtual reality with freely-moving observers to test the predictions of the 3D reconstruction and 'view-based' models. Head-tracked virtual reality allows us to control the scene the observer sees and to track their movements accurately. Certain spatial abilities have been taken as evidence that the observer must create a 3D reconstruction of the scene in the brain. For example, people are able to view a scene, remember where objects are, walk to a new location and then point back to one of the objects they had seen originally even if it is no longer visible (i.e. people can update the visual direction of objects as they move). However, this capacity does not necessarily require that the brain generate a 3D model of the scene and, as evidence, we will extend view-based models to include this pointing task and others like it. We will then test the predictions of both view-based and 3D reconstruction models against the performance of human participants carrying out the same tasks.
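As an illustration of what the pointing task demands, here is a minimal sketch of the prediction made by an idealised, undistorted 3D reconstruction model. It assumes a 2D ground plane and perfect knowledge of the observer's position and heading. The function name `pointing_direction` and the coordinates are hypothetical, and this is not the model fitted in the project. The observer stores the target's coordinates, walks to a new location, and the predicted pointing direction is the bearing of the stored target relative to the new heading.

```python
import math

def pointing_direction(observer_xy, heading_rad, target_xy):
    """Egocentric bearing of the target in radians, wrapped to [-pi, pi].

    heading_rad is the observer's facing direction measured anticlockwise
    from the +x axis; positive results mean the target is to the left.
    """
    dx = target_xy[0] - observer_xy[0]
    dy = target_xy[1] - observer_xy[1]
    world_bearing = math.atan2(dy, dx)
    relative = world_bearing - heading_rad
    # Wrap the difference so that it stays within one half-turn either way.
    return math.atan2(math.sin(relative), math.cos(relative))

# Illustrative values only: the target is seen from the start position,
# then the observer walks past it while keeping the same heading.
target = (0.0, 1.0)
before = pointing_direction((0.0, 0.0), 0.0, target)  # target to the left
after = pointing_direction((0.0, 2.0), 0.0, target)   # target now to the right
```

Human pointing errors can then be compared with this prediction and with the prediction of a view-based alternative.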

As well as predicting the pattern of errors in simple navigation and pointing tasks, we will also measure the effect of two types of stimulus change. 3D reconstruction uses 'corresponding points', which are points in an image that arise, for example, from the same physical object (or part of an object) as a camera or person moves around it. Using a novel stimulus, we will keep all of these 'corresponding points' in a scene constant yet, at the same time, change the scene so that the images alter radically when the observer moves. This manipulation should have a dramatic effect on a view-based scheme but no effect at all on any system based only on corresponding points.

Overall, we will have a tight coupling between experimental observations and quantitative predictions of performance under two types of model. This will allow us to determine which of the two models most accurately reflects human behaviour in a 3D environment. One potential outcome of the project is that view-based models will provide a convincing account of performance in tasks that have previously been considered to require 3D reconstruction, opening up the possibility that a wide range of tasks can be explained within a view-based framework.

Planned Impact

Our experiments aim to deliver a more accurate model of human spatial representation and navigation behaviour than exists at present. There are clear industrial applications for this type of knowledge in several distinct areas. We have two existing collaborations that allow us to have a direct impact. First, we have a long-standing relationship with Microsoft Research Cambridge, who fund a current PhD student with us (since October 2011). Andrew Fitzgibbon, who co-supervises the project, and others at Microsoft (John Winn, Antonio Criminisi) are interested in algorithms that use non-Cartesian, view-based representations for applications that have traditionally relied on 3D metric models.

Second, we have a collaboration with the car manufacturer, Renault. They perform much of their car-interior prototyping in virtual reality, and are keenly interested in perception data coming from our lab in order to determine which types of scene manipulation will have a noticeable perceptual effect and which will not. They also have a strong interest in the calibration methods and high-quality virtual reality that we have available in our laboratory. Renault will fund a new PhD student in our lab starting in 2012.

Our laboratory is involved in a range of outreach activities, including open days for the public and for school children from the Sutton Trust. Dr Glennerster has advertised the work of the lab by giving plenary and other talks at 3DTV conferences, where producers and technologists alike are interested in problems with the way that 3DTV and 3D cinema are interpreted and perceived. Dr Glennerster has given public lectures (e.g. Royal College of Surgeons) and our laboratory has engaged the public in a demonstration at the Royal Society of the mutations involved in the potassium channel affected in neonatal diabetes: children could fly through a model of their own channel as it opened and closed and was 'mended' by the drug that cured their condition.

Funded Value:

£419,878

Funded Period:

Feb 13 - Sep 16

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/K011766/1

Principal Investigator:

Andrew Glennerster

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Image & Vision Computing (30%)

Vision & Senses - ICT appl. (70%)

Organisations

People	ORCID iD
Andrew Glennerster (Principal Investigator)

Publications

Glennerster A (2018) A single coordinate framework for optic flow and binocular disparity. in arXiv e-prints

Glennerster A (2015) Visual stability-what is the problem? in Frontiers in Psychology

Gootjes-Dreesbach L (2017) Comparison of view-based and reconstruction-based models of human navigational strategy. in Journal of Vision

Muryy A (2017) Navigation and pointing errors in non-metric environments. in Journal of Vision

Scarfe P (2015) Using high-fidelity virtual reality to study perception in freely moving observers. in Journal of Vision

Scarfe P (2021) Combining cues to judge distance and direction in an immersive virtual reality environment. in Journal of Vision

Glennerster A (2016) A moving observer in a three-dimensional world. in Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences

Muryy A (2021) Route selection in non-Euclidean virtual environments. in PLoS ONE

Vuong J (2019) No single, stable 3D representation can explain pointing biases in a spatial updating task. in Scientific Reports

Scarfe P (2014) Humans use predictive kinematic models to calibrate visual cues to three-dimensional surface slant. in The Journal of Neuroscience: the official journal of the Society for Neuroscience

Key Findings
Further Funding
Collaboration
Engagement Activities


Description	One paper published from this grant shows that sensory adaptation (in this case, people recalibrating their sense of slant) depends on how they interpret the physical interactions of objects (in this case, a ball bouncing differently when it is spinning). This demonstrates that 'low-level cues', such as the range of slants people see, are not the only cause of adaptive changes in the visual system. We have also published a review paper on how to set up a high-fidelity virtual reality lab (at one point this was the most downloaded paper in Journal of Vision) and a theoretical paper on the problem of visual stability. A major review paper on the problem of spatial representation in a moving observer was published in Phil Trans B. This sets out a radical alternative to theories that suppose the brain builds 3D 'maps' or reconstructions of the scene. A paper comparing view-based and reconstruction models as predictors of human navigation in a homing task is now published in Journal of Vision. Another paper has been published in LNCS (and another is on bioRxiv and under review in PLoS ONE) describing the errors that people make when they point at remembered objects while walking in a maze and when they try to take shortcuts. A paper on spatial updating of objects when an observer moves is published in Scientific Reports. This discusses the importance of Generative Query Networks as potential models for human navigation and spatial representation.
Exploitation Route	We are collaborating with engineers who are interested in adaptive control systems. These results have implications for the design of such systems. We are now developing a more extensive collaboration with Professor Phil Torr's group in the Department of Engineering in Oxford and with Professor Abhinav Gupta's lab in the Robotics Group at Carnegie Mellon University. The VR lab results will be used to inform and adapt machine learning techniques for learning spatial layout and will inform ideas about spatial representation in humans. Navigation systems in autonomous vehicles may in future use representations that are more like those that have evolved in animals. Understanding these may therefore have important economic consequences.
Sectors	Digital/Communication/Information Technologies (including Software)
URL	http://www.personal.reading.ac.uk/~sxs05ag/


Description	The action-based brain: a provocation to philosophy, robotics and the cognitive sciences
Amount	£30,982 (GBP)
Funding ID	AH/N006011/1
Organisation	Arts & Humanities Research Council (AHRC)
Sector	Public
Country	United Kingdom
Start	03/2016
End	02/2017


Description	Understanding Scenes and Events through Joint Parsing, Cognitive Reasoning and Lifelong Learning
Amount	£443,434 (GBP)
Funding ID	EP/N019423/1
Organisation	Engineering and Physical Sciences Research Council (EPSRC)
Sector	Public
Country	United Kingdom
Start	02/2016
End	01/2019


Description	Collaboration with Microsoft Research, Cambridge
Organisation	Microsoft Research
Department	Microsoft Research Cambridge
Country	United Kingdom
Sector	Private
PI Contribution	co-supervising PhD student
Collaborator Contribution	co-supervising PhD student, trips to Reading and Cambridge, and plans to hold a conference on 1-3 July 2015 at Microsoft Research Cambridge.
Impact	Yes, this is multi-disciplinary (computer vision and human psychophysics). Scientific Reports publication (2019).
Start Year	2011


Description	Collaboration with Phil Torr's group in Robotics, University of Oxford
Organisation	University of Oxford
Department	Department of Engineering Science
Country	United Kingdom
Sector	Academic/University
PI Contribution	We have begun a collaboration that will be extended as part of EPSRC grant EP/N019423/1. We will provide access to the Virtual Reality lab in Reading and psychophysical expertise. The aim is to compare human performance on navigation tasks with that of reinforcement learning techniques trained on games that require navigation to obtain rewards. We are currently writing a grant together to submit to EPSRC to continue this collaboration.
Collaborator Contribution	The Torr group will carry out the modelling described above.
Impact	Multidisciplinary: neuroscience and computer vision/machine learning.
Start Year	2016


Description	Journal of Neuroscience press release
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Media (as a channel to the public)
Results and Impact	We entered into discussion with a BBC journalist about covering our lab's work on VR. The journalist has said he will follow this up with a program covering new developments in VR, particularly following the $2 billion investment by Facebook in Oculus Rift.
Year(s) Of Engagement Activity	2014
URL	http://peterscarfe.com/bounceRecalibration.html


Description	Microsoft Research Cambridge conference
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	15 academics from USA, Europe and UK and members of Microsoft Research (MSR) Cambridge met at MSR to discuss 'view-based' approaches to spatial representation. This led to a very fruitful exchange of ideas between the computer vision and neuroscience communities and should result in two publications.
Year(s) Of Engagement Activity	2015
URL	http://www.glennersterlab.com/MSRMeeting2015/index.html


Description	cBBC coverage
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Media (as a channel to the public)
Results and Impact	cBBC approached us to ask about virtual reality. Our lab comes up readily on searches for virtual reality. We participated in a program about the future of VR. It included VR from the haptics group in Systems Engineering at Reading, with whom we collaborate. Journalists at the BBC say they will contact us again in relation to similar topics.
Year(s) Of Engagement Activity	2014
URL	http://bbc.in/1nJ4f1N
