Multi-activity 3D pose estimation on real environments
Lead Research Organisation:
Queen's University Belfast
Department Name: Sch of Electronics, Elec Eng & Comp Sci
Abstract
Human-machine interfaces, video surveillance, sports performance enhancement, physical therapy and smart environments, to name a few, are important societal challenges that require better automatic behaviour analysis to be fully addressed. To move closer to the level of human proficiency, fully automatic understanding of a scene requires a whole range of capabilities: reliable extraction of each actor involved, their pose and their activities. This involves the combined application of pose estimation, multi-target tracking and activity recognition. While impressive progress has been made in each of these fields in isolation, reliable methods that can be applied to real-world, unconstrained environments remain a challenge. In this project we focus on the intermediate components of behaviour analysis: we set aside the traditional cascade pipeline, in which pose estimation frequently plays a secondary role or is omitted entirely because of its complexity, and propose a novel architecture that has 3D pose estimation as the key central component, with feedback between each of the other components.
In this project, we propose to investigate the automated 3D pose estimation and tracking of multiple people in realistic scenarios. This research is motivated by the fact that all current methods operate under strong limitations and assumptions that preclude their application to real-world situations. While some methods require multiple high-resolution sensors, thereby ruling out the use of current and near-future sensor network infrastructures, others struggle with scenes containing multiple people, or succeed only when the subjects do not interact and the activity performed is known beforehand. This last assumption reduces the practical applicability of pose estimation and prevents its use for activity recognition and/or behavioural analysis.
To address this limitation, in this project we propose to extend the assumption from a single known activity as the prior model to one where a class of multiple activities is assumed, e.g. walking, running, fighting, shaking hands. This requires us to develop a novel multi-activity model that can be used as prior information to accurately and robustly estimate 3D pose under complex, real-world conditions. This multi-activity model avoids presuming which of the given activities each subject in the scene is performing. The development and use of such a model is the key novel contribution of this proposal, and is a first step towards fully activity-agnostic 3D pose estimation for real environments.
Furthermore, we propose a paradigm change to the conventional behaviour analysis chain: pose estimation becomes the cornerstone of the system, and feedback loops with tracking (to address occlusions and interactions) and activity recognition (to switch between a set of plausible activities during estimation) allow us to deal with the aforementioned issues. By modelling transitions between this set of activities, and observing how predicted poses propagate over time through the activity space, the current activity can be recognised and fed back to refine the pose estimation. This is the second novelty of this proposal. Lastly, inaccuracies in pose estimation caused by occlusion and by multiple interacting people can be overcome by using information from the tracker to determine which image regions provide reliable pose information. Similarly, knowing the pose and activity of the subjects in the scene improves tracking performance. This is the third novel aspect of the proposal.
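The activity-switching idea above can be illustrated as a simple filtering loop over a discrete activity set: a transition model propagates a belief over activities, and each activity's pose prior re-weights that belief against the observed pose. Everything concrete below (the activity names, transition probabilities and the one-dimensional "pose" feature) is a hypothetical sketch for illustration, not the project's actual model.

```python
import numpy as np

# Hypothetical discrete activity set and a row-stochastic transition
# matrix P(activity_t | activity_{t-1}); values are invented.
ACTIVITIES = ["walking", "running", "fighting"]
T = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.85, 0.05],
    [0.05, 0.05, 0.90],
])

# Per-activity Gaussian pose priors (mean, std) over a toy 1-D pose feature.
POSE_PRIORS = {"walking": (0.0, 0.5), "running": (2.0, 0.7), "fighting": (4.0, 1.0)}

def step(belief, observed_pose):
    """One filtering step: propagate the activity belief through the
    transition model, then re-weight each activity by how well its pose
    prior explains the observed pose (unnormalised Gaussian likelihood)."""
    predicted = belief @ T
    likelihood = np.array([
        np.exp(-0.5 * ((observed_pose - mu) / sd) ** 2) / sd
        for mu, sd in (POSE_PRIORS[a] for a in ACTIVITIES)
    ])
    posterior = predicted * likelihood
    return posterior / posterior.sum()

belief = np.full(len(ACTIVITIES), 1.0 / len(ACTIVITIES))  # uniform start
for obs in [0.1, 0.2, 2.1, 2.3, 2.2]:  # pose drifts from walking to running
    belief = step(belief, obs)
print(ACTIVITIES[int(np.argmax(belief))])  # most plausible current activity
```

In a full system, the recognised activity would select the pose prior for the next estimation step, closing the feedback loop between pose estimation and activity recognition that the proposal describes.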
Planned Impact
This research is key to the development of next-generation video analytics and surveillance systems. These systems should be able to mimic human performance, thereby ensuring enhanced situational awareness and leading to timely decision making. This proposal seeks to take advantage of existing surveillance camera infrastructure, which translates into reduced cost for companies and the taxpayer. The societal impact is two-fold: detecting suspicious behaviour in real time to prevent crime, and providing forensic evidence that can lead to increased convictions.
Next-generation video analytics will be key to the protection of Critical National Infrastructure, which will become a major societal concern in the new smart society. Our proposal will enable automatic monitoring and analysis of the behaviour of staff and intruders, to reduce potential damage and injuries to staff and infrastructure. This is of particular interest for the national security of UK citizens: the Centre for the Protection of National Infrastructure has identified terrorism as a severe threat to vital services, and has named physical security as one of the three main disciplines that need to be improved, with appropriate investment in CCTV and intruder alarms. The UK Government has also committed £860m to its national Cyber Security Programme. CSIT, with the Home Office, PSNI and GCHQ as industrial advisory board members and collaborators in previous projects, is the perfect environment for this research to achieve this impact.
Britain's ageing population, small average family size and wide geographical dispersal leave many elderly citizens isolated, with serious consequences for physical and mental health. The UK Department of Health expects the number of people over 65 to grow by 51% by 2030. This is currently a key issue in Parliament and has had an impact on the NHS, where a national target was set to increase the number of older people living at home. Assistive technology has shown its value in helping people live independently, improving the quality of life for users and carers, but it has not yet been used to its full potential. Our research will enable the detection of subtle differences between dangerous and normal behaviour using affordable video sensors, and will improve automatic and remote monitoring, increasing coverage and speeding up response without increasing cost.
Exploiting innovative technology such as this will increase the UK's competitive position both within Europe and on the global stage. The UK security market is expected to grow from £2.8bn in 2014 to £3.4bn by 2017 (Competitive Analysis of the UK Cyber Security Sector, PAC, 2013). As the main path, the outcomes of this research will be commercialised through a CSIT spin-off company, Cognition Video. Cognition's business model is based on supplying advanced analytics to Physical Security Information Management (PSIM) systems, which provide a platform designed to integrate multiple unconnected security applications in order to identify security breaches. The outcomes of this research will be added to the current system to provide cutting-edge capabilities that will allow the PSIM company to compete, grow and differentiate itself from the rest of the market, since current PSIMs rely on basic analytics, creating new jobs and economic gains.
Results will also be disseminated to the CSIT industrial advisory board, which includes major companies in the security sector such as Thales, BAE and Roke Manor. CSIT also holds regular meetings and summits and publishes white papers, bringing researchers together with end users. While taking advantage of CSIT's position as the UK's Innovation and Knowledge Centre for cyber security to maximise impact, this research will also increase the reputation of CSIT as a global innovation hub, enhancing our ability to attract further investment from industry and public-sector funds in Europe and beyond.
People |
ORCID iD |
Jesus Martinez Del Rincon (Principal Investigator) |
Publications
McLaughlin N
(2019)
Video Person Re-Identification for Wide Area Tracking Based on Recurrent Neural Networks
in IEEE Transactions on Circuits and Systems for Video Technology
McLaughlin N
(2022)
3-D Human Pose Estimation Using Iterative Conditional Squeeze and Excitation Networks.
in IEEE transactions on cybernetics
Lennox, M.
(2022)
Visual Re-identification within Large Herds of Holstein Friesian Cattle
Description | - Individual activity models, as well as multi-activity models, are useful for improving pose estimation in complicated scenes. - A new neural network training mechanism to improve pose estimation. - Combining 2D and 3D pose estimation models yields more robust models in real-life environments |
Exploitation Route | The new learning mechanism for neural networks can be used to further improve deep-learning-based systems in many applications. Use of activity-specific models to refine pose estimation. Use of partial or incomplete data to improve pose estimation in real environments |
Sectors | Digital/Communication/Information Technologies (including Software) Leisure Activities including Sports Recreation and Tourism Security and Diplomacy |
Description | Findings from this research are being evaluated to perform pose estimation analysis on animals (chickens and cattle) applied to animal welfare. Our approach allows continuous and automatic pose estimation that can be used to calculate an effective mobility score for the animals, checking for anomalies in their behaviour that relate to injuries or limping. The main advantage over traditional visual inspection is that this is effortless for the farmer and cost-effective, since cheap monocular cameras can be used on the farm to automate the process. |
First Year Of Impact | 2019 |
Sector | Agriculture, Food and Drink,Healthcare |
Impact Types | Societal Economic |
Description | CSIT 2 |
Amount | £5,532,504 (GBP) |
Funding ID | EP/N508664/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2015 |
End | 03/2022 |
Description | Capital Award emphasising support for Early Career Researchers |
Amount | £150,000 (GBP) |
Funding ID | EP/S017682/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 02/2019 |
End | 07/2021 |
Description | FlockFocus Phase II |
Amount | £693,098 (GBP) |
Funding ID | FFAR - 21-000053 |
Organisation | Foundation for Food and Agriculture Research |
Sector | Charity/Non Profit |
Country | United States |
Start | 05/2022 |
End | 11/2023 |
Description | Flockfocus - Developing automated surveillance tools to safeguard chicken welfare |
Amount | £221,955 (GBP) |
Funding ID | FFAR Sb-0000000021 - Flockfocus |
Organisation | Foundation for Food and Agriculture Research |
Sector | Charity/Non Profit |
Country | United States |
Start | 03/2020 |
End | 09/2021 |
Description | Knowledge Transfer Partnerships Ref: KTP 12165 (Queen's University Belfast, Cattle Eye Limited) |
Amount | £171,801 (GBP) |
Funding ID | KTP 12165 |
Organisation | Cattle Eye Ltd |
Sector | Private |
Country | United Kingdom |
Start | 08/2020 |
End | 09/2022 |
Description | SymbIoT: A Symbiotic IoT Ecosystem for Smart Environments |
Amount | € 173,771 (EUR) |
Funding ID | SBPLY/17/180501/000334 |
Organisation | European Union |
Sector | Public |
Country | European Union (EU) |
Start | 08/2018 |
End | 08/2021 |
Description | Video-based semantic analysis for on crowded rail station |
Amount | £133,201 (GBP) |
Organisation | Innovate UK |
Sector | Public |
Country | United Kingdom |
Start | 03/2020 |
End | 05/2020 |
Description | UCLM |
Organisation | University of Castile-La Mancha |
Country | Spain |
Sector | Academic/University |
PI Contribution | Providing expertise on computer vision and machine learning, and data for use-case studies. Submission of several conference and journal papers. Contribution to several H2020 EU submissions |
Collaborator Contribution | Providing expertise on artificial intelligence, common-sense reasoning and hardware implementation. Submission of several conference and journal papers. Contribution to several H2020 EU submissions and other projects |
Impact | Materialisation of the joint research on the Spanish-funded research project "SymbIoT: A Symbiotic IoT Ecosystem for Smart Environments" Research papers: J. Martinez, M.J. Santofimia, X. del Toro, J. Barba, F. Romero, P. Navas, J.C. Lopez, "Non-linear classifiers applied to EEG analysis for epilepsy seizure detection", in Expert Systems with Applications, vol. 86, pp. 99-112, 2017. M.J. Santofimia, J. Martinez, X. Hong, H. Zhou, P. Miller, D. Villa, J.C. Lopez, "Hierarchical Task Network Planning with Common-Sense Reasoning for Multiple-People Behaviour Analysis", in Expert Systems with Applications, vol. 69, pp.118-137, 2017. X. Hong, Y. Huang, W. Ma, S. Varadarajan, P. Miller, W. Liu, M. J Romero, J. Martinez, H. Zhou, "Evidential Event Inference in Transport Video Surveillance", in Computer Vision and Image Understanding, 144, pp. 276-297, 2016. R. Cantarero, M. J. Santofimia, D. Villa, R. Requena, M. Campos, F. Florez, J-C. Nebel, J. Martinez, J. C Lopez, "Kinect and Episodic Reasoning for Human Action Recognition", in LNCS-Lecture Notes in Computer Science (13th International Conference on Distributed Computing and Artificial Intelligence (DCAI)), 2016. |
Start Year | 2014 |
Title | Human Pose Estimation software demo |
Description | Software for a consumer-level webcam able to estimate the full human pose, based on the research developed in this project |
Type Of Technology | Webtool/Application |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | None yet |
URL | https://github.com/niallmcl/Human-Pose-Estimation |
Description | Deepdive with Johnson Control |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Industry/Business |
Results and Impact | Presentation of our computer vision capabilities to Johnson Controls for potential licensing and collaboration |
Year(s) Of Engagement Activity | 2022 |
Description | Presentation at CSIT away day |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | Presentation of our research and outcomes at our research centre's yearly meeting. It was also a good opportunity for the RA to establish himself within the centre. |
Year(s) Of Engagement Activity | 2018 |
Description | Presentation at a Research seminar |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | Presentation of this research at Queen's University by the RA of this project, as a new lecturer |
Year(s) Of Engagement Activity | 2018 |
Description | Presentation of research outcome to BT |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | Presentation of current capabilities to BT for possible deployment at Belfast Harbour |
Year(s) Of Engagement Activity | 2021 |
Description | Presentation of research outcome to Boeing |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Presentation of current capabilities to Boeing for future engagements |
Year(s) Of Engagement Activity | 2021 |
Description | Presentation of results at CSIT Industrial Advisory Board |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | Outcomes, quantitative results and progress are presented every 3 months to the CSIT Industrial Advisory Board. This board includes an expert panel composed of member companies such as BAE Systems, Thales, Allstate, Seagate and many others, as well as government representatives such as EPSRC, GCHQ, the Home Office, etc. |
Year(s) Of Engagement Activity | 2017,2018,2019 |
URL | http://www.csit.qub.ac.uk/Collaboration/Membership-Programme-Benefits/ |
Description | Presentation of results at Defence Science and Technology Laboratory (DSTL) |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Policymakers/politicians |
Results and Impact | Outcomes, quantitative results and progress were presented to DSTL. |
Year(s) Of Engagement Activity | 2018 |
Description | Workshop on Semantic event recognition for BAE systems |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | Workshop on event recognition and pose estimation for BAE systems |
Year(s) Of Engagement Activity | 2020 |