Recognition and Localisation of Human Actions in Image Sequences
Lead Research Organisation:
Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science
Abstract
The explosion in the amount of digital visual data generated and distributed nowadays can only be compared to the similar explosion in the amount of textual data witnessed in the decade before. However, while retrieval based on textual information has made great progress and resulted in commercially usable search engines (e.g. Google, Yahoo), vision-based retrieval of multimedia material remains an open research question. As the amount of produced and distributed video increases at an unprecedented pace, the significance of having efficient methods for content-based indexing in terms of the depicted actions can hardly be overestimated. In the domain of human motion analysis in particular, progress is expected to boost applications in human-computer interaction, health care, surveillance, computer animation and games, and multimedia retrieval. However, mapping low-level visual descriptors to high-level action/object models is an open problem, and the analysis faces major challenges to the degree that the analysed image sequence exhibits large variability in appearance and in the spatiotemporal structure of the actions, occlusions, cluttered backgrounds and large motions. In addition, learning structure and appearance models is hindered by the fact that the segmentation and annotation needed to create training datasets are onerous tasks. For these reasons, there is a great incentive to develop recognition and localisation methods that can learn either from few annotated examples or in a way that minimises the amount of manual segmentation and annotation required.
This project will build on recent developments in Computer Vision and Pattern Recognition in order to develop methods for the recognition and localisation of human and animal action categories in image sequences. Once trained, the methods should be able to detect and localise, in a previously unseen image sequence, all the actions that belong to one of the known categories.
The methods will allow the models to be learned incrementally, starting from a few examples, and will support computer-assisted manual interaction through appropriate interfaces in order to facilitate model refinement. The methodologies will allow training the models on image sequences with significant background clutter, that is, in the presence of multiple objects/actions in the scene and moving cameras. No prior knowledge of the anatomy of the human body is assumed, and therefore the models will be able to identify a large class of action categories, including facial/hand/body actions and animal motion, as well as interactions between humans and objects in their environment (such as drinking a glass of water).
People
Ioannis Patras (Principal Investigator)
Publications
Guo W (2012) "Tensor learning for regression", IEEE Transactions on Image Processing.
Kotsia I (2012) "Human action localization with support tensor machines".
Kaymak S (2013) in Computer Vision - ACCV 2012 Workshops.
Koelstra S (2010) "A dynamic texture-based approach to recognition of facial actions and their temporal models", IEEE Transactions on Pattern Analysis and Machine Intelligence.
Koelstra S (2013) "Fusion of facial expressions and EEG for implicit affective tagging", Image and Vision Computing.
Kotsia I (2012) "Support tensor action spotting".
Kotsia I (2011) "Support Tucker machines".
Description | The project has developed machine learning and computer vision methods for the analysis of facial expressions and body gestures, so as to recognise the behaviour and actions of humans in natural environments. We have advanced the state of the art and have shown that machines are getting better at performing such tasks. |
Exploitation Route | 1) Content providers and distributors could utilise the results on facial expression analysis for inferring people's affective and cognitive state while watching films and TV programmes. 2) Robot manufacturers could utilise the methods for facial expression analysis and gesture recognition for natural interfaces. 3) Gaming companies could use both the pose estimation and the gesture recognition results for game control. 4) Applications such as interactive programs that guide people through their daily exercises could be built on the technology for gesture recognition and pose estimation. The research can be utilised by companies and academic institutions that are interested in behaviour analysis. This includes analysis of human behaviour for assisted living (e.g. of elderly people), or restaurants/shops that monitor customer behaviour, preferences, interaction with products, or reaction to provided services. Our work on facial expression analysis can be used for analysing human reactions (e.g. affective states) to the presentation of multimedia content. In the latter direction, and in collaboration with partners from the FP7 Network of Excellence Petamedia, results have already been obtained and published. Digital media companies can also utilise the findings. Specifically, the action spotting and action recognition algorithms developed in this project can be used in video annotation and/or retrieval systems for better managing digital media. Academic researchers can also utilise the theoretical findings of our work. In particular, our work on tensor-based regression/classification and on max-margin non-negative matrix factorisation provides core pattern recognition methodologies with applications beyond the field of Computer Vision.
The dissemination efforts include a dedicated website (http://www.eecs.qmul.ac.uk/~ioannisp/ralis.htm). Source code for several of our methods is provided online (http://www.eecs.qmul.ac.uk/~ioannisp/source.htm). |
Sectors | Creative Economy; Digital/Communication/Information Technologies (including Software); Healthcare |
URL | http://www.eecs.qmul.ac.uk/~ioannisp/ralis.htm |
Description | The work on localisation of human actions, and in particular the work on part-based models, laid the foundations for research that led to a collaboration with Yamaha Motors Ltd. That collaboration led to a follow-up project with Yamaha and a patent application for a pedestrian detection system (Spring 2016). The work on human motion analysis also supports a recently awarded InnovateUK project (SensingFeeling) that aims to monitor and assess the affective state of people in the retail environment. |
Sector | Creative Economy; Leisure Activities, including Sports, Recreation and Tourism; Manufacturing, including Industrial Biotechnology; Transport |
Impact Types | Societal, Economic |
Description | Direct Industrial Funding (from Yamaha Motors Ltd) |
Amount | £150,000 (GBP) |
Organisation | Yamaha Motors |
Sector | Private |
Country | United Kingdom |
Start | 11/2014 |
End | 11/2016 |
Title | Machine Learning codes |
Description | Methods for data analysis, classification and regression. |
Type Of Material | Data analysis technique |
Year Produced | 2011 |
Provided To Others? | Yes |
Impact | The code has been used by a few researchers worldwide. |
URL | http://www.eecs.qmul.ac.uk/~ioannisp/source.htm |
Description | Academic Visit of Dr. Javier Trevor |
Organisation | Jaume I University |
Country | Spain |
Sector | Academic/University |
PI Contribution | Dr. Javier Trevor, an academic at Universitat Jaume I, Spain, collaborated with me during a six-month visit to QMUL. The visit was funded by an award from the Spanish Ministry of Education and Research. The related proposal was entitled "Human Action Recognition with partial and hidden information".
Start Year | 2011 |
Description | Collaboration with Imperial College |
Organisation | Imperial College London |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Ongoing collaboration with the group of Prof. Pantic on facial and body gesture analysis, which has resulted in several publications. Joint supervision of Antonis Oikonomopoulos led to several papers on action recognition, and collaboration on pose-invariant facial expression recognition took place in the framework of the work of Ognjen Rudovic.
Start Year | 2009 |
Description | Collaboration with Institute on Telematics and Informatics |
Organisation | Centre for Research and Technology Hellas (CERTH) |
Country | Greece |
Sector | Academic/University |
PI Contribution | Within the framework of a Doctorate Programme that I initiated, the Informatics and Telematics Institute (ITI-CERTH, Greece) funded the salaries and paid the fees of several researchers enrolled as PhD students at QMUL under my supervision. Two students have already graduated, two will graduate by 2017, and one will enrol in Spring 2017. I am co-supervisor of the students. |
Collaborator Contribution | The Informatics and Telematics Institute (ITI-CERTH, Greece) funded the salaries and paid the fees of the researchers, provided equipment and travel costs, and contributed to co-supervision of the research. |
Impact | In the period 2009-2015 the collaboration resulted in 25 publications. |
Start Year | 2009 |
Title | Max-Margin Semi-NMF |
Description | This code implements the Max-Margin Semi-NMF (MNMF) method presented in Vijay Kumar, Irene Kotsia and Ioannis Patras, "Max-Margin Semi-NMF", BMVC 2011.
Type Of Technology | Software |
Year Produced | 2012 |
Impact | The paper has been cited 10 times since 2012. |
URL | http://www.eecs.qmul.ac.uk/~ioannisp/source.htm |
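The released MNMF code adds a max-margin classification term to the factorisation; as a rough illustration of the underlying semi-NMF structure only (X ≈ FGᵀ with G non-negative and F unconstrained), the standard multiplicative-update algorithm of Ding et al. can be sketched as follows. This is a generic sketch, not the project's released implementation, and the function name and parameters are illustrative.

```python
import numpy as np

def semi_nmf(X, k, n_iter=200, seed=0):
    """Plain semi-NMF (Ding et al.): X ~= F @ G.T with G >= 0, F unconstrained.
    Illustrative only -- NOT the project's max-margin (MNMF) variant."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.standard_normal((n, k))
    G = np.abs(rng.standard_normal((m, k)))       # non-negative init
    pos = lambda A: (np.abs(A) + A) / 2.0          # positive part of a matrix
    neg = lambda A: (np.abs(A) - A) / 2.0          # negative part (as a >=0 matrix)
    for _ in range(n_iter):
        # F-step: unconstrained least squares given G
        F = X @ G @ np.linalg.pinv(G.T @ G)
        # G-step: multiplicative rule that preserves non-negativity
        XtF = X.T @ F
        FtF = F.T @ F
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF) + 1e-12
        G *= np.sqrt(num / den)
    return F, G
```

Because the F-step is a closed-form least-squares solve and the G-step only rescales non-negative entries, the reconstruction error decreases while G stays non-negative, which is the structural property the max-margin variant builds on.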
Title | Support Tucker Machines |
Description | This code implements Support Tucker Machines (STuMs) and Sw-STuMs, as presented in Irene Kotsia and Ioannis Patras, "Support Tucker Machines", in CVPR 2011, 2011. |
Type Of Technology | Software |
Year Produced | 2012 |
Impact | The paper has been cited 22 times since 2012. |
URL | http://www.eecs.qmul.ac.uk/~ioannisp/source.htm |
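Support Tucker Machines constrain the SVM weight tensor to a Tucker decomposition (a small core tensor multiplied by a factor matrix along each mode). The learning code solves for the factors inside the SVM objective; purely as an illustration of the Tucker structure itself, a truncated higher-order SVD can be sketched as follows (generic HOSVD, not the project's code; names are illustrative).

```python
import numpy as np

def hosvd(W, ranks):
    """Truncated higher-order SVD: W ~= core x_1 U1 x_2 U2 ... (Tucker form).
    Illustrates the low-rank structure STuMs impose on the weight tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        # Unfold W along `mode` and keep the leading left singular vectors
        unfolded = np.moveaxis(W, mode, 0).reshape(W.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolded, full_matrices=False)
        factors.append(U[:, :r])
    # Core: project W onto the factor subspaces, mode by mode
    core = W
    for mode, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

def tucker_reconstruct(core, factors):
    """Multiply the core by each factor matrix along its mode."""
    W = core
    for mode, U in enumerate(factors):
        W = np.moveaxis(np.tensordot(U, np.moveaxis(W, mode, 0), axes=1), 0, mode)
    return W
```

With full ranks the reconstruction is exact; with truncated ranks the parameter count drops from the product of the mode sizes to the (much smaller) core plus factors, which is what makes the Tucker-structured SVM weight tractable.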
Title | Tensor Regression |
Description | This code implements Support Tensor Regression (STR) as presented in Weiwei Guo, Irene Kotsia and Ioannis Patras, "Tensor Learning for Regression", in IEEE Transactions on Image Processing, 2011. |
Type Of Technology | Software |
Year Produced | 2012 |
Impact | The paper has been cited by 30 researchers since 2012 |
URL | http://www.eecs.qmul.ac.uk/~ioannisp/source.htm |
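Tensor regression methods of this kind predict a scalar from a tensor-valued input via an inner product with a low-rank weight tensor. As a toy illustration of that idea only (a rank-1 bilinear model y = uᵀXv fitted by alternating least squares, not the project's STR implementation; all names are illustrative):

```python
import numpy as np

def rank1_tensor_regression(Xs, ys, n_iter=50, seed=0):
    """Fit y = u^T X v over matrix-valued inputs X by alternating least squares.
    A toy sketch of the low-rank weight structure used in tensor regression."""
    rng = np.random.default_rng(seed)
    n, p, q = Xs.shape
    u = rng.standard_normal(p)
    v = rng.standard_normal(q)
    for _ in range(n_iter):
        # Fix v: y = u^T (X v) is linear in u
        Zu = Xs @ v                              # shape (n, p)
        u, *_ = np.linalg.lstsq(Zu, ys, rcond=None)
        # Fix u: y = (X^T u)^T v is linear in v
        Zv = np.einsum('npq,p->nq', Xs, u)       # shape (n, q)
        v, *_ = np.linalg.lstsq(Zv, ys, rcond=None)
    return u, v
```

Each half-step is an ordinary least-squares solve, so the model needs only p + q parameters instead of the p x q of a full weight matrix, which is the key advantage of the tensor-structured regressors when training data are scarce.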