Recognition and Localisation of Human Actions in Image Sequences

Lead Research Organisation: Queen Mary University of London

Department Name: Sch of Electronic Eng & Computer Science

Abstract

The explosion in the amount of digital visual data that is generated and distributed today is paralleled only by the explosion in the amount of textual data witnessed in the previous decade. However, while retrieval based on textual information has made great progress and resulted in commercially usable search engines (e.g. Google, Yahoo), vision-based retrieval of multimedia material remains an open research question. As the amount of produced and distributed video increases at an unprecedented pace, the significance of efficient methods for content-based indexing in terms of the depicted actions can hardly be overestimated. In the analysis of human motion in particular, progress is expected to boost applications in human-computer interaction, health care, surveillance, computer animation and games, and multimedia retrieval. However, mapping low-level visual descriptors to high-level action/object models is an open problem, and the analysis faces major challenges to the extent that the analysed image sequence exhibits large variability in the appearance and spatiotemporal structure of the actions, occlusions, cluttered backgrounds and large motions. In addition, learning structure and appearance models is hindered by the fact that segmentation and annotation for the creation of training datasets are onerous tasks. For these reasons, there is a great incentive to develop recognition and localisation methods that can learn either from few annotated examples or in a way that minimises the amount of required manual segmentation and annotation.
This project will build on recent developments in Computer Vision and Pattern Recognition in order to develop methods for recognition and localisation of human and animal action categories in image sequences. Once trained, the methods should be able to detect and localise, in a previously unseen image sequence, all the actions that belong to one of the known categories.
The methods will allow the models to be learned incrementally, starting from few examples, and will support computer-assisted manual interaction through appropriate interfaces in order to facilitate model refinement. The methodologies will allow the models to be trained on image sequences in which there is significant background clutter, that is, in the presence of multiple objects/actions in the scene and moving cameras. No prior knowledge of the anatomy of the human body is assumed, and therefore the models will be able to identify a large class of action categories, including facial/hand/body actions, animal motion, and interaction between humans and objects in their environment (such as drinking a glass of water).

Funded Value:

£340,932

Funded Period:

Sep 09 - Nov 12

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/G033935/1

Principal Investigator:

Ioannis Patras

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Image & Vision Computing (100%)

Organisations

People	ORCID iD
Ioannis Patras (Principal Investigator)

Publications

Guo W (2012) Tensor learning for regression. in IEEE Transactions on Image Processing

Heng Yang (2013) Privileged information-based conditional regression forest for facial feature detection

Irene Kotsia (Co-Author) (2012) Human action localization with support tensor machines

Kaymak S (2013) Computer Vision - ACCV 2012 Workshops

Koelstra S (2010) A dynamic texture-based approach to recognition of facial actions and their temporal models. in IEEE Transactions on Pattern Analysis and Machine Intelligence

Koelstra S (2013) Fusion of facial expressions and EEG for implicit affective tagging. in Image and Vision Computing

Kotsia I (2012) Support tensor action spotting

Kotsia I (2013) Computer Vision - ACCV 2012

Kotsia I (2011) Support tucker machines

Kotsia I (2010) Multiplicative Update Rules for Multilinear Support Tensor Machines

Key Findings
Impact Summary
Further Funding
Research Databases and Models
Collaboration
Software and Technical Products


Description	The project has developed machine learning and computer vision methods for the analysis of facial expressions and body gestures, so as to recognise the behaviour and actions of humans in natural environments. We have advanced the state of the art and have shown that machines are getting better at performing such tasks.
Exploitation Route	1) Content providers and distributors could utilise the results on facial expression analysis for inferring people's affective and cognitive state while watching films and TV programmes. 2) Robot manufacturers could utilise the methods for facial expression analysis and gesture recognition for natural interfaces. 3) Gaming companies could use both the pose estimation and the gesture recognition results for game control. 4) Applications such as interactive programs that guide people through their daily exercises could be built on the technology for gesture recognition and pose estimation. The research can be utilised by companies and academic institutions that are interested in behaviour analysis. This includes analysis of human behaviour for assisted living (e.g. of elderly people), or restaurants/shops that monitor customer behaviour and/or preferences and/or interaction with products and/or reaction to provided services. Our work on facial expression analysis can be used for analysing human reactions (e.g. affective states) to the presentation of multimedia content. In the latter direction, and in collaboration with partners from the FP7 Network of Excellence Petamedia, results have already been obtained and published. Digital media companies can also utilise the findings. Specifically, the action spotting and action recognition algorithms developed in this project can be used in video annotation and/or retrieval systems for better management of digital media. Academic researchers can also utilise the theoretical findings of our work. In particular, our works on tensor-based regression/classification and on max-margin non-negative matrix factorisation are core pattern recognition methodologies with applications beyond the field of Computer Vision.
The dissemination efforts include a dedicated website (http://www.eecs.qmul.ac.uk/~ioannisp/ralis.htm). Source code for several of our methods is provided online (http://www.eecs.qmul.ac.uk/~ioannisp/source.htm).
Sectors	Creative Economy,Digital/Communication/Information Technologies (including Software),Healthcare
URL	http://www.eecs.qmul.ac.uk/~ioannisp/ralis.htm


Description	The work on localisation of human actions, and in particular the works on part-based models, laid the foundations for research that led to a collaboration with Yamaha Motors Ltd. That collaboration led to a follow-up project with Yamaha and a submitted patent for a pedestrian detection system (Spring 2016). The work on human motion analysis also supports a recently awarded InnovateUK project (SensingFeeling) that aims to monitor and assess the affective state of people in the retail environment.
Sector	Creative Economy,Leisure Activities, including Sports, Recreation and Tourism,Manufacturing, including Industrial Biotechnology,Transport
Impact Types	Societal,Economic


Description	Direct Industrial Funding (from Yamaha Motors Ltd)
Amount	£150,000 (GBP)
Organisation	Yamaha Motors
Sector	Private
Country	United Kingdom
Start	11/2014
End	11/2016


Title	Machine Learning codes
Description	Methods for data analysis, classification and regression.
Type Of Material	Data analysis technique
Year Produced	2011
Provided To Others?	Yes
Impact	The code has been used by a few researchers worldwide.
URL	http://www.eecs.qmul.ac.uk/~ioannisp/source.htm


Description	Academic Visit of Dr. Javier Trevor
Organisation	Jaume I University
Country	Spain
Sector	Academic/University
PI Contribution	Dr. Javier Trevor, an academic at Universitat Jaume I, Spain, collaborated with me during a 6-month visit to QMUL. The visit was funded by an award from the Spanish Ministry of Education and Research. The related proposal was entitled: Human Action Recognition with partial and hidden information.
Start Year	2011


Description	Collaboration with Imperial College
Organisation	Imperial College London
Country	United Kingdom
Sector	Academic/University
PI Contribution	Ongoing collaboration with the group of Prof. Pantic in facial and body gesture analysis that resulted in several publications. Joint supervision of Antonis Oikonomopoulos, which resulted in several papers in action recognition. Collaboration in pose-invariant facial expression recognition in the framework of the work of Ognjen Rudovic.
Start Year	2009


Description	Collaboration with the Informatics and Telematics Institute
Organisation	Centre for Research and Technology Hellas (CERTH)
Country	Greece
Sector	Academic/University
PI Contribution	Within the framework of a Doctorate Programme that I initiated, the Informatics and Telematics Institute (ITI-CERTH, Greece) funded the salaries and paid the fees of several researchers enrolled as PhD students at QMUL under my supervision. Two students have already graduated, two will graduate by 2017 and one will enrol in Spring 2017. I am co-supervisor of the students.
Collaborator Contribution	The Informatics and Telematics Institute (ITI-CERTH, Greece) funded the salaries and paid the fees of the researchers, and provided equipment, travel costs and co-supervision of the research.
Impact	In the period 2009 - 2015 the collaboration resulted in 25 publications.
Start Year	2009


Title	Max-Margin Semi-NMF
Description	This code implements the paper Max-Margin Semi-NMF (MNMF) as presented in Vijay Kumar, Irene Kotsia and Ioannis Patras, "Max-Margin Semi-NMF", in BMVC 2011.
Type Of Technology	Software
Year Produced	2012
Impact	The paper has been cited 10 times since 2012.
URL	http://www.eecs.qmul.ac.uk/~ioannisp/source.htm


Title	Support Tucker Machines
Description	This code implements Support Tucker Machines (STuMs) and Sw-STuMs, as presented in Irene Kotsia and Ioannis Patras, "Support Tucker Machines", in CVPR 2011.
Type Of Technology	Software
Year Produced	2012
Impact	The paper has been cited 22 times since 2012.
URL	http://www.eecs.qmul.ac.uk/~ioannisp/source.htm


Title	Tensor Regression
Description	This code implements Support Tensor Regression (STR) as presented in Weiwei Guo, Irene Kotsia and Ioannis Patras, "Tensor Learning for Regression", in IEEE Transactions on Image Processing, 2011.
Type Of Technology	Software
Year Produced	2012
Impact	The paper has been cited by 30 researchers since 2012.
URL	http://www.eecs.qmul.ac.uk/~ioannisp/source.htm
