Exploiting spatial cognition in picture database design
Lead Research Organisation: University of Nottingham
Department Name: School of Psychology
Abstract
Picture databases have great commercial and functional potential in applications ranging from medicine to medieval history. However, this potential is proving difficult to deliver, and research activity in the area is intense. There are two principal approaches: (i) extending existing methods that encode pictures by description and keywords; and (ii) computational analysis of images to capture superficial aspects such as colour and texture, which aims to remove the effort of entering pictures into a database and to let users' crude depictions retrieve subsets of pictures that can then be searched by recognition. However, current opinion suggests that neither approach is likely to deliver cost-effective solutions in the medium term, except in highly specialised areas.
Our research proposes to innovate by considering the problem from a third, psychological, perspective: human spatial cognition is robust, and we are generally good at inspecting pictures and recalling their spatial layout later. Furthermore, the layout of most images can be described in ways that preserve elements of meaning and visual distinctiveness. Ideally, therefore, databases that encode location information in pictures, and allow users to use that information in retrieval, match human skills with a method applicable to most task domains. To this end, this proposal links two lines of psychological research. First, we are interested in visual attention: how do people look at pictures? For the purposes of database design, we are interested in the relationship between picture content and attention, as expressed in eye movements. Although eye movements are variable, they do show elements of consistency. We will be concerned with how best to represent and evaluate this consistency as a function of factors such as picture content, different observers, task domain, and the delay between storage and retrieval.
Second, we aim to study how the spatial layout of images is remembered as a consequence of attention. Can we use our understanding of visual attention processes (and eye movements in particular) to predict spatial recall? How precise is this spatial knowledge, how could it be used, and how discriminating is it in the retrieval of images from a database? There are two issues here. (i) We know that some location knowledge is acquired very quickly during inspection. This is also the stage when the viewer's eye movements are more predictable by computer, because they are driven by visual analysis of the image and depend less on its meaning. It follows that if we can model the relationship between early eye movements and location memory, and if that memory is useful in retrieval, then some indexing of pictures into databases can be automated. This research aims to evaluate this potential. (ii) As inspection continues, eye movements become harder to predict as the viewer's understanding of the content of the image develops. We aim to show how this meaning influences eye movements, and how it affects location memory beyond that gained in the early stages of viewing. Overall, these two complementary questions will tell us how much picture coding can be automated and how task- and user-specific factors will influence design.
As a study of the feasibility of applying this innovation to the design of picture databases, this proposal also considers the adaptability and efficiency of the approach in different circumstances. Accordingly, in evaluating the cost benefits to picture databases, the project will seek to measure the contribution of domain expertise, training, and some interface design issues. This will indicate whether the approach has general applicability in picture databases or whether it is best applied to bespoke, specialist systems where training and expertise are required.
People
Geoffrey Underwood (Principal Investigator)
Publications
Foulsham T (2009). Does conspicuity enhance distraction? Saliency and eye landing position when searching for objects. Quarterly journal of experimental psychology (2006).
Foulsham T (2008). Turning the world around: patterns in saccade direction vary with picture orientation. Vision research.
Foulsham T (2010). If Visual Saliency Predicts Search, Then Why? Evidence from Normal and Gaze-Contingent Search Tasks in Natural Scenes. Cognitive Computation.
Humphrey K (2009). Domain knowledge moderates the influence of visual saliency in scene recognition. British journal of psychology (London, England : 1953).
Humphrey K (2010). The potency of people in pictures: evidence from sequences of eye fixations. Journal of vision.
Humphrey K (2010). See What I'm Saying? Expertise and Verbalisation in Perception and Imagery of Complex Scenes. Cognitive Computation.
Humphrey K (2012). Salience of the lambs: a test of the saliency map hypothesis with pictures of emotive objects. Journal of vision.
Underwood G (2009). Saliency and scan patterns in the inspection of real-world scenes: Eye movements during encoding and recognition. Visual Cognition.
Underwood G (2008). Is attention necessary for object identification? Evidence from eye movements during the inspection of real-world scenes. Consciousness and cognition.
Underwood G (2009). Cognitive Processes in Eye Guidance: Algorithms for Attention in Image Processing. Cognitive Computation.