Multi-activity 3D pose estimation on real environments

Lead Research Organisation: Queen's University of Belfast
Department Name: Electronics Electrical Eng and Comp Sci

Abstract

Human-machine interfaces, video surveillance, sport performance enhancement, physical therapy and smart environments, to name a few, are important societal challenges that require better automatic behaviour analysis to be fully addressed. To move closer to the level of human proficiency, fully automatic understanding of a scene requires a whole range of capabilities: reliable extraction of each actor involved, their pose and their activities. This involves the combined application of pose estimation, multi-target tracking and activity recognition. While impressive progress has been made in each of these fields in isolation, reliable methods that can be applied to real-world, unconstrained environments remain a challenge. In this project we focus on the intermediate components of behaviour analysis: we set aside the traditional cascade pipeline, in which pose estimation frequently plays a secondary role or is omitted entirely because of its complexity, and propose a novel architecture in which 3D pose estimation is the key central component, with feedback between each of the other components.

In this project, we propose to investigate the automated 3D pose estimation and tracking of multiple people in realistic scenarios. This research is motivated by the fact that all current methods operate under strong limitations and assumptions that preclude their application to real-world situations. While some methods require multiple high-resolution sensors, ruling out the use of current and near-future sensor network infrastructures, others struggle with scenes containing multiple people, or succeed only when the subjects do not interact and the activity being performed is known beforehand. This last assumption reduces the practical applicability of pose estimation and prevents its use for activity recognition and/or behavioural analysis.

To address this limitation, we propose to extend the assumption from a single known activity as the prior model to a class of multiple activities, e.g., walking, running, fighting, shaking hands, etc. This requires us to develop a novel multi-activity model that can be used as prior information to accurately and robustly estimate 3D pose under complex, real-world conditions. This multi-activity model avoids presuming which activity, among the given set, each subject in the scene is performing. The development and use of such a model is the key novel contribution of this proposal, and is a first step towards fully activity-agnostic 3D pose estimation for real environments.
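The idea of a prior over a set of activities, rather than one known activity, can be illustrated with a minimal sketch. Everything below is hypothetical: the activity names come from the proposal, but the toy pose dimensionality, the Gaussian form of each activity-specific model and all parameter values are illustrative assumptions, not the project's actual model.

```python
import numpy as np

# Hypothetical sketch: a multi-activity pose prior as a uniform mixture of
# activity-specific Gaussian models over a (toy) joint-angle vector.
ACTIVITIES = ["walking", "running", "fighting", "shaking_hands"]
POSE_DIM = 6  # toy joint-angle vector; real skeletons have many more DoF

rng = np.random.default_rng(0)
# One (mean, diagonal std) pose model per activity, assumed learned offline.
means = {a: rng.normal(size=POSE_DIM) for a in ACTIVITIES}
stds = {a: 0.5 + rng.random(POSE_DIM) for a in ACTIVITIES}

def log_gauss(x, mu, sigma):
    """Log density of a diagonal Gaussian."""
    return -0.5 * np.sum(((x - mu) / sigma) ** 2
                         + np.log(2.0 * np.pi * sigma ** 2))

def multi_activity_log_prior(pose):
    """Prior log-likelihood of a candidate pose under the activity mixture,
    without committing to any single activity (uniform mixture weights)."""
    logs = np.array([log_gauss(pose, means[a], stds[a]) for a in ACTIVITIES])
    m = logs.max()  # log-sum-exp for numerical stability
    return m + np.log(np.exp(logs - m).sum()) - np.log(len(ACTIVITIES))

pose = rng.normal(size=POSE_DIM)
print(multi_activity_log_prior(pose))
```

A candidate pose close to any one activity's typical poses scores well under the mixture, which is exactly what lets the estimator stay agnostic about which activity is being performed.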

Furthermore, we propose a paradigm change to the conventional behaviour analysis chain: pose estimation becomes the cornerstone of the system, and feedback loops with tracking (to address occlusions and interactions) and activity recognition (to switch between a set of plausible activities during estimation) allow us to deal with the aforementioned issues. By modelling transitions between this set of activities, and observing how predicted poses propagate in time through the activity space, the current activity can be recognised and used as feedback for refining the pose estimation. This is the second novelty of this proposal. Lastly, inaccuracies in the pose estimation caused by occlusion and multiple interacting people can be overcome by using information from the tracker to determine image regions that provide reliable pose estimation information. Similarly, knowing the pose and activity of the subjects in the scene improves tracking performance. This is the third novel aspect of the proposal.
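The activity-switching feedback described above can be sketched as an HMM-style forward update over the activity set: the belief over activities is propagated through a transition model and reweighted by how well the estimated pose fits each activity-specific model. This is a toy illustration only; the transition matrix, activity set and per-frame pose likelihoods below are invented placeholders, not the project's learned values.

```python
import numpy as np

# Hypothetical sketch of the activity-recognition feedback loop.
ACTIVITIES = ["walking", "running", "fighting"]
# Row-stochastic transition model: activities tend to persist between frames.
T = np.array([[0.90, 0.05, 0.05],
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])

def forward_update(belief, pose_likelihoods):
    """One feedback step: propagate the activity belief through the
    transition model, then reweight by how well the current pose
    estimate fits each activity-specific model."""
    predicted = T.T @ belief
    posterior = predicted * pose_likelihoods
    return posterior / posterior.sum()

belief = np.full(3, 1 / 3)  # start with no assumed activity
# Toy per-frame likelihoods of the estimated pose under each activity model,
# e.g. a subject transitioning from walking to running.
frames = [np.array([0.6, 0.3, 0.1]),
          np.array([0.2, 0.7, 0.1]),
          np.array([0.1, 0.8, 0.1])]
for lik in frames:
    belief = forward_update(belief, lik)

# The recognised activity would be fed back to select the pose prior.
print(ACTIVITIES[int(belief.argmax())])  # prints "running"
```

The key design point is the loop itself: the recognised activity narrows the pose prior for the next frame, while the refined pose sharpens the activity belief, which is the feedback relationship the proposal describes.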

Planned Impact

This research is key to the development of next-generation video analytics and surveillance systems. These systems should be able to mimic human performance, ensuring enhanced situational awareness and leading to timely decision making. This proposal seeks to take advantage of existing surveillance camera infrastructure, which translates into reduced cost for companies and the taxpayer. The societal impact is two-fold: detecting suspicious behaviour in real time to prevent crime, and providing forensic evidence that can lead to increased convictions.

Next-generation video analytics will be key to protecting Critical National Infrastructure, a major societal concern in the emerging smart society. Our proposal will enable automatic monitoring and analysis of the behaviour of staff and intruders, to reduce potential damage and injuries to staff and infrastructure. This is of particular interest for the national security of UK citizens: the Centre for the Protection of National Infrastructure has identified terrorism as a severe threat to vital services, and has named physical security as one of the three main disciplines that need to be improved, with appropriate investment in CCTV and intruder alarms. The UK Government has also committed £860m to its national Cyber Security Programme. CSIT, with the Home Office, PSNI and GCHQ as industrial advisory board members and collaborators on previous projects, is the perfect environment for this research to achieve this impact.

Britain's ageing population, small average family size and wide geographical dispersal leave many elderly citizens isolated, with serious consequences for physical and mental health. The UK Department of Health expects the number of people over 65 to grow by 51% by 2030. This is currently a key issue in Parliament and has had an impact on the NHS, where a national target was set to increase the number of older people living at home. Assistive technology has shown its value in helping people live independently, improving the quality of life of users and carers, but it has not yet been used to its full potential. Our research will enable the detection of subtle differences between dangerous and normal behaviour using affordable video sensors, improving automatic and remote monitoring and increasing coverage and speed of response, while avoiding an increase in cost.

Exploiting innovative technology such as this will increase the UK's competitive position both within Europe and on the global stage. The UK security market is expected to grow from £2.8bn in 2014 to £3.4bn by 2017 (Competitive Analysis of the UK Cyber Security Sector, PAC, 2013). As the main path, the outcomes of this research will be commercialised through a CSIT spin-off company, Cognition Video. Cognition's business model is based on supplying advanced analytics to Physical Security Information Management (PSIM) systems, which provide a platform designed to integrate multiple unconnected security applications in order to identify security breaches. The outcomes of this research will be added to the current system to provide cutting-edge capabilities that will allow the PSIM company to compete, grow and differentiate itself from the rest of the market, since current PSIMs rely on basic analytics, creating new jobs and economic gains.

Results will also be disseminated to the CSIT industrial advisory board, which includes major companies in the security sector such as Thales, BAE and Roke Manor. CSIT also holds regular meetings and summits and publishes white papers, bringing researchers together with end users. While taking advantage of CSIT's position as the UK's Innovation and Knowledge Centre for cyber security to maximise impact, this research will also enhance CSIT's reputation as a global innovation hub, improving our ability to attract further investment from industry and public sector funds in Europe and beyond.

Publications

 
Description - Individual activity models, as well as multi-activity models, are useful for improving pose estimation in complicated scenes. - A new neural network training mechanism improves pose estimation. - Combining 2D and 3D pose estimation models yields more robust models in real-life environments.
Exploitation Route The new learning mechanism for neural networks can be used to further improve deep-learning-based methods in many applications. Use of activity-specific models to refine pose estimation. Use of partial or incomplete data to improve pose estimation in real environments.
Sectors Digital/Communication/Information Technologies (including Software),Leisure Activities, including Sports, Recreation and Tourism,Security and Diplomacy

 
Description Findings from this research are being evaluated to perform pose estimation analysis on animals (chickens and cattle) applied to animal welfare. Our approach allows continuous and automatic pose estimation that can be used to calculate an effective mobility score for the animals, checking for anomalies in their behaviour that relate to injuries or limping. The main advantage over traditional visual inspection is that it is effortless for the farmer and cost-effective, since cheap monocular cameras can be used on the farm to automate the process.
First Year Of Impact 2019
Sector Agriculture, Food and Drink,Healthcare
Impact Types Societal,Economic

 
Description CSIT 2
Amount £5,032,504 (GBP)
Funding ID EP/N508664/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 04/2015 
End 03/2020
 
Description Capital Award emphasising support for Early Career Researchers
Amount £150,000 (GBP)
Funding ID EP/S017682/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 02/2019 
End 07/2020
 
Description Flockfocus - Developing automated surveillance tools to safeguard chicken welfare
Amount £221,955 (GBP)
Funding ID FFAR Sb-0000000021 - Flockfocus 
Organisation Foundation for Food and Agriculture Research 
Sector Charity/Non Profit
Country United States
Start 03/2020 
End 09/2021
 
Description Knowledge Transfer Partnerships Ref: KTP 12165 (Queen's University Belfast, Cattle Eye Limited)
Amount £171,801 (GBP)
Funding ID KTP 12165 
Organisation Cattle Eye Ltd 
Sector Private
Country United Kingdom
Start 09/2020 
End 09/2022
 
Description SymbIoT: A Symbiotic IoT Ecosystem for Smart Environments
Amount € 173,771 (EUR)
Funding ID SBPLY/17/180501/000334 
Organisation European Union 
Sector Public
Country European Union (EU)
Start 09/2018 
End 08/2021
 
Description Video-based semantic analysis for crowded rail stations
Amount £133,201 (GBP)
Organisation Innovate UK 
Sector Public
Country United Kingdom
Start 03/2020 
End 05/2020
 
Description UCLM 
Organisation University of Castile-La Mancha
Country Spain 
Sector Academic/University 
PI Contribution Providing expertise on computer vision and machine learning, and data for use-case studies. Submission of several conference and journal papers. Contribution to several H2020 EU submissions.
Collaborator Contribution Providing expertise on artificial intelligence, common-sense reasoning and hardware implementation. Submission of several conference and journal papers. Contribution to several H2020 EU submissions and other projects.
Impact Materialisation of the joint research in the Spanish-funded research project "SymbIoT: A Symbiotic IoT Ecosystem for Smart Environments". Research papers:
J. Martinez, M.J. Santofimia, X. del Toro, J. Barba, F. Romero, P. Navas, J.C. Lopez, "Non-linear classifiers applied to EEG analysis for epilepsy seizure detection", Expert Systems with Applications, vol. 86, pp. 99-112, 2017.
M.J. Santofimia, J. Martinez, X. Hong, H. Zhou, P. Miller, D. Villa, J.C. Lopez, "Hierarchical Task Network Planning with Common-Sense Reasoning for Multiple-People Behaviour Analysis", Expert Systems with Applications, vol. 69, pp. 118-137, 2017.
X. Hong, Y. Huang, W. Ma, S. Varadarajan, P. Miller, W. Liu, M.J. Romero, J. Martinez, H. Zhou, "Evidential Event Inference in Transport Video Surveillance", Computer Vision and Image Understanding, vol. 144, pp. 276-297, 2016.
R. Cantarero, M.J. Santofimia, D. Villa, R. Requena, M. Campos, F. Florez, J.-C. Nebel, J. Martinez, J.C. Lopez, "Kinect and Episodic Reasoning for Human Action Recognition", Lecture Notes in Computer Science (13th International Conference on Distributed Computing and Artificial Intelligence, DCAI), 2016.
Start Year 2014
 
Title Human Pose Estimation software demo 
Description Software for a consumer-level webcam that estimates the full human pose, based on the research developed in this project 
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact None yet 
URL https://github.com/niallmcl/Human-Pose-Estimation
 
Description Presentation at CSIT away day 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Presentation of our research and outcomes at our research centre's yearly meeting. It was also a good opportunity for the RA to establish himself within the centre.
Year(s) Of Engagement Activity 2018
 
Description Presentation at a Research seminar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Presentation of this research by the project RA, as a new lecturer at Queen's University
Year(s) Of Engagement Activity 2018
 
Description Presentation of research outcome to BT 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presentation of current capabilities to BT for possible deployment at Belfast Harbour
Year(s) Of Engagement Activity 2021
 
Description Presentation of research outcome to Boeing 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Presentation of current capabilities to Boeing for future engagements
Year(s) Of Engagement Activity 2021
 
Description Presentation of results at CSIT Industrial Advisory Board 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Outcomes, quantitative results and progress are presented every three months to the CSIT Industrial Advisory Board. This board includes an expert panel composed of member companies such as BAE Systems, Thales, Allstate, Seagate and many others, as well as government representatives such as EPSRC, GCHQ, the Home Office, etc.
Year(s) Of Engagement Activity 2017,2018,2019
URL http://www.csit.qub.ac.uk/Collaboration/Membership-Programme-Benefits/
 
Description Presentation of results at Defence Science and Technology Laboratory (DSTL) 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Outcomes, quantitative results and progress were presented to DSTL.
Year(s) Of Engagement Activity 2018
 
Description Workshop on Semantic event recognition for BAE systems 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Workshop on event recognition and pose estimation for BAE systems
Year(s) Of Engagement Activity 2020