Reverse Storyboarding for Video Content Based Retrieval

Lead Research Organisation: University of Surrey
Department Name: Vision Speech and Signal Proc CVSSP

Abstract

This proposal will explore the use of storyboard sketches as a means of automatically indexing and interactively querying large video databases. In doing so we address the Computer Vision problem of video Content Based Image Retrieval (CBIR) using techniques borrowed from animation and Computer Graphics.

Storyboards are a powerful visual scripting technique commonly used by animators to plan and communicate the content of their productions. In simple terms, a storyboard is a series of visual images that illustrate a film's key scenes and events. However, storyboards are more than static sketches: they also depict dynamics using a variety of motion cues borrowed from contemporary animation, such as streak and ghosting lines and object deformations, as well as more conventional indicators such as arrowheads. Much as an artist's sketch is a spatial abstraction of a static scene, so a storyboard may be considered a spatio-temporal abstraction depicting salient instants within an image sequence (video).

We argue that storyboards have a number of unique properties that can be brought to bear on video CBIR. Storyboards confer a rich vocabulary with which to concisely pose search queries. They can represent a range of dynamics such as oscillation, translation, rotation, collision and occlusion. Previous use of temporal constraints in CBIR has been limited mainly to object trajectories and scene transitions, and we believe that the ability to specify additional dynamic constraints will help to improve search accuracy and performance. Storyboards are also a highly conventional and comprehensible narrative form; comic books, for example, are used to convey stories to non-technical audiences of all ages. Thus we believe that storyboards will also provide a natural and usable interface for video CBIR.

The proposed research will build upon the PI's recent work on video stylisation, specifically on rendering video into cartoons.
We will build upon our Video Paintbox software to diversify the gamut of motion cues we are able to infer from real video footage. These techniques will be harnessed to parse dynamic cues from clips stored in video databases. We will develop an ontology for the representation of dynamic cues in video, and develop novel algorithms for matching search queries to the database based on spatial and dynamic content. This work will be integrated to produce a demonstrable storyboard-driven CBIR system. This system will enable us to test our hypothesis that the intuitively comprehensible nature of storyboard sketches, combined with their compact representation and rich vocabulary of dynamics, may be leveraged to enhance the usability, accuracy and performance of video CBIR.