Multi-Object Video Behaviour Modelling for Abnormality Detection and Differentiation

Lead Research Organisation: Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science

Abstract

There are over 4.2 million closed-circuit television (CCTV) surveillance cameras operational in the UK and many more worldwide, collecting a colossal amount of video data for security, safety, and infrastructure and facility management purposes. A typical existing CCTV system relies on a handful of human operators at a centralised control room to monitor video inputs from hundreds of cameras. Too many cameras and too few operators leave the system ill equipped to detect events and anomalies that require an immediate and appropriate response. Consequently, the use of existing CCTV surveillance systems is limited predominantly to post-mortem analysis. There is thus an increasing demand for automated intelligent systems that analyse the content of the vast quantities of surveillance video and trigger alarms in a timely and robust fashion. One of the most critical functions of such a system is to monitor object behaviour captured in the videos and detect or predict any suspicious and abnormal behaviour that could pose a threat to public safety and security.

This project aims to develop underpinning capabilities for an innovative intelligent video analytics system for detecting abnormal video behaviour in public spaces. More specifically, the project will address three open problems:

1. To develop a new model of spatio-temporal visual context for abnormal behaviour detection. Behaviours are inherently context-aware, exhibited through constraints imposed by scene layout and the temporal nature of activities in a given scene. Consequently, the same behaviour can be deemed either normal or abnormal depending on where and when it occurs. We aim to go beyond the state-of-the-art semantic scene modelling approaches, most of which focus solely on modelling scene layout such as entry and exit points, by developing a more comprehensive spatio-temporal model of dynamic visual context.

2. To develop a novel multi-object behaviour model for real-time detection and differentiation of abnormalities in complex video behaviours that involve multiple objects interacting with each other (e.g. a group of people meet in front of a ticket office at a train station and then go to different platforms).

3. To develop a novel online adaptive learning algorithm for estimating the parameters of the behaviour model to be developed. Although video abnormality detection tools are already available in many existing CCTV control systems, human operators are often reluctant to use them because there are too many parameters to tune and re-tune for different scenarios as the visual context changes. With the incremental and adaptive learning algorithm, our behaviour model can be used for different surveillance scenarios over a long period of time with minimal human intervention. More importantly, using the algorithm, our behaviour model will adapt both to changes in visual context (and therefore in the definition of normality and abnormality) and to valuable feedback from human operators on the abnormality detection output of the model.

Publications

 
Description The project has the following key findings:

(1) We have developed a machine learning algorithm that can detect abnormal object behaviour in video from very few examples. For example, if a CCTV (closed-circuit television) operator watches a short clip of CCTV footage and notices something out of the ordinary, he or she only needs to indicate that there is an abnormality in the clip. The developed algorithm can then automatically identify what the abnormality is and build a model to detect it if it occurs again in the future. This is achieved by developing a novel topic model with fast learning and inference algorithms. Such an algorithm is very useful in practice, where asking a human to locate and describe a behaviour anomaly in video is hard, but giving a binary indication of whether there is an anomaly in the video is easier.

(2) Human feedback can be prompted at the right moment and exploited subsequently to improve the performance of a classifier, as well as to discover new classes as quickly as possible. This is achieved by formulating a novel active learning criterion that has no parameters to tune. With this approach we can address many real-world problems involving rare class discovery and classification, including video anomaly detection, financial fraud detection, and computer network intrusion detection. For example, this approach can be used together with the anomaly detection algorithm mentioned above. More specifically, if a human operator is asked to give feedback on every single video clip, he or she will be overwhelmed. However, if our model can automatically identify the video clips that matter most for improving the model, only feedback on those clips is needed, reducing the human workload significantly.

(3) For detecting objects in images or actions in video, it is possible to learn a model with weak supervision from humans; that is, the annotator only needs to indicate whether or not the image or video contains the object or action of interest, rather than its precise location. This makes learning a detector a much easier task. For example, one can search for the keyword "cat" on Google Images and get thousands of images containing cats. Without needing to be told exactly where the cat is in each image, our approach can build a model of a cat and use it to locate each cat not only in the retrieved images, but also in any unseen images.
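
As a rough illustration of the idea in finding (2), the sketch below ranks video clips by how uncertain a classifier is about them and asks the operator for feedback only on the top few. It uses plain entropy-based uncertainty sampling, which is a generic stand-in and not the project's own parameter-free criterion; the function names, clip identifiers, and probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_clips_for_feedback(clip_probs, budget):
    """Return the ids of the `budget` clips with the most uncertain predictions.

    clip_probs maps a clip id to the classifier's predicted class
    probabilities for that clip (hypothetical values in this sketch).
    """
    ranked = sorted(clip_probs, key=lambda cid: entropy(clip_probs[cid]), reverse=True)
    return ranked[:budget]


# Hypothetical classifier outputs: [P(normal), P(abnormal)] for each clip.
clip_probs = {
    "clip_a": [0.98, 0.02],
    "clip_b": [0.55, 0.45],
    "clip_c": [0.80, 0.20],
    "clip_d": [0.50, 0.50],
}

queries = select_clips_for_feedback(clip_probs, budget=2)
print(queries)  # ['clip_d', 'clip_b']
```

With a budget of two, the operator is asked about only the two clips the classifier is least sure of, instead of all four.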
Exploitation Route The research community in the areas of computer vision and machine learning can benefit directly from the 18 research publications generated by the project. Many of the project publications have high citation numbers.


The outcomes of the project can be used in non-academic context in a number of ways:

(1) The general public's safety and security can be improved by applying the abnormal behaviour detection techniques developed in the project. In particular, by detecting anomalies as they occur or even before they happen, law enforcement agencies can act promptly to protect the lives and property of citizens.

(2) Operators of CCTV surveillance systems can benefit greatly from the outcomes of the project. In particular, the operators can focus on the automatically identified suspicious behaviour rather than spreading their attention across dozens or even hundreds of video feeds.

(3) Users of social media sharing websites can benefit from the project outcomes by having more efficient image and video search tools.

(4) Visually impaired individuals can also benefit via improved methods for describing and summarising image and video data. The developed techniques can potentially be integrated into a system that can automatically annotate the content of images and videos. The generated annotation can be read out by a voice synthesiser.
Sectors Creative Economy, Digital/Communication/Information Technologies (including Software), Security and Diplomacy, Transport

 
Description Royal Society International Exchange Programme
Amount £11,200 (GBP)
Funding ID IE110976 
Organisation The Royal Society 
Sector Charity/Non Profit
Country United Kingdom
Start 04/2012 
End 03/2014
 
Title Dynamic background model 
Description dpgmm: (Apache 2.0) A Dirichlet process Gaussian mixture model, implemented using the mean field variational technique. It is about as good as a general-purpose density estimator can get, though it suffers from a heavy computational and memory burden. The code is pure Python, depends on the gcp module, and is very neat and reasonably well commented; speed is reasonable for the method as the code vectorises well. Unlike some other implementations, it handles incremental learning correctly, both adding extra sticks to the model and adding extra data after convergence.
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact The dynamic background model has been very popular among the research community and many other researchers benefit from using the code released by us. 
URL https://github.com/thaines/helit
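
The "sticks" mentioned in the dpgmm description refer to the stick-breaking construction commonly used to represent Dirichlet process mixture weights. The sketch below shows only that construction in plain Python. It is not the dpgmm module's API: the function name is hypothetical, and the break fractions are supplied directly as an assumption rather than drawn from or fitted to a Beta distribution.

```python
def stick_breaking_weights(fractions):
    """Convert break fractions v_1..v_K (each in [0, 1]) into mixture weights.

    The k-th weight is w_k = v_k * prod_{j<k} (1 - v_j): each component takes
    a fraction of whatever stick is left. Returns (weights, leftover), where
    leftover is the stick length not yet assigned to any component.
    """
    weights = []
    remaining = 1.0
    for v in fractions:
        if not 0.0 <= v <= 1.0:
            raise ValueError("break fractions must lie in [0, 1]")
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining


# Three sticks, each breaking off half of what is left.
weights, leftover = stick_breaking_weights([0.5, 0.5, 0.5])
print(weights, leftover)  # [0.5, 0.25, 0.125] 0.125

# Adding an extra stick splits only the leftover; earlier weights are unchanged.
more_weights, more_leftover = stick_breaking_weights([0.5, 0.5, 0.5, 0.2])
```

Because a new stick only divides the unassigned remainder, a model can grow by adding components without disturbing the weights already assigned, which is what makes incremental extension possible in this representation.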