Analysis of Facial Behaviour for Security in 4D (4D-FAB)

Lead Research Organisation: Imperial College London
Department Name: Computing

Abstract

The overall aim of the project is to develop tools for automatic spatio-temporal analysis and understanding of subtle human facial behaviour from 4D facial information (i.e. 3D high-quality video recordings of facial behaviour). Two exemplar applications related to security will be specifically addressed in this proposal: (a) person verification (i.e. using facial behaviour as a biometric trait), and (b) deception indication.

The importance of non-obtrusive person verification and deception indication is indisputable - every day, thousands of people go through airport security checkpoints, border crossing checkpoints, and other security screening points. Automated, unobtrusive monitoring and assessment of deceptive behaviour will form a valuable tool for end users, such as police, justice and prison services. This is particularly important because currently only informal interpretations are used for detecting deceptive behaviour. In addition, alternative methods for person verification that are based not only on physical traits but also on behavioural, easily observable traits like facial expressions would be of great value for the development of multimodal biometric systems. Such multimodal biometric systems will be of great interest to government agencies such as the Home Office or the UK Border Agency.

For automatic deception indication we propose to develop methodologies for detecting 4D micro-expressions and their dynamics that are typical of deceptive behaviour, as reported by research in psychology. For automatic person identification we propose to increase the robustness of static face-image-based verification systems by including facial dynamics as an additional biometric trait. The underlying motivation is that dynamic 4D facial behaviour is very difficult to imitate and, hence, has natural resilience against spoof attacks.

The project focuses on 3D video recordings rather than on 2D video recordings of facial behaviour for two main reasons: (1) increased robustness to changes in head pose, and (2) the ability to spot subtle changes in the depth of the facial surface, such as jaw clench and tremor appearance on the cheeks, which are typical of deceptive behaviour and cannot be spotted in 2D images. Research on 3D facial dynamics is now made possible by the tremendous advances in sensors and devices for the acquisition of 3D face video recordings.

The core of the project will deal with both the development of the 4D-FAB research platform, containing tools for the analysis of subtle human facial behaviour in 4D videos, and the development of an annotated data repository consisting of two parts: (1) annotated 4D recordings of deceptive and truthful behaviour, and (2) annotated 4D recordings of subjects uttering a sentence, deliberately displaying certain facial actions and expressions, and spontaneously displaying certain facial actions and expressions. The work plan is oriented around this central goal of developing 4D-FAB technology and is carried out in 3 work packages described in the proposal.

A team of 3 Research Associates (RAs), led by the PIs and with a background in computer vision and machine learning, will develop the 4D-FAB technology. The team will be closely assisted by 6 members of the Advisory Board:
Prof. Burgoon, University of Arizona, advising on psychology of deception and credibility
Prof. Cohn, Pittsburgh University / Carnegie Mellon University, advising on face perception and facial behaviometrics
Prof. Nunamaker, Director of BORDERS, US Nat'l Center for Border Security and Immigration, advising on making 4D-FAB useful for end users in security domain
Dr Hampson, Head of Science & Technology, OSCT, Home Office, advising on making 4D-FAB useful for end users
Dr Cohen, Director of United Technologies Research Centre Ireland, advising on making 4D-FAB useful for end users
Dr Urquhart, CEO of Dimensional Imaging, advising on 4D recording setup design

Planned Impact

The overall aim of the project is to develop tools for automatic spatio-temporal analysis and understanding of human facial behaviour from 4D facial information (i.e. 3D high-quality video recordings of facial behaviour). Two exemplar applications related to security will be specifically addressed in this proposal: (a) person verification (i.e. facial behaviour as a form of behaviometrics), and (b) deception indication.

The importance of non-obtrusive person verification and deception indication is indisputable - every day, thousands of people go through access control points and security screening checkpoints. Automated, unobtrusive person verification and assessment of deceptive behaviour will form a valuable tool for end users, such as police, justice and prison services. Such systems will be of great interest to government agencies such as the Home Office and Border Agency (both being project partners in this proposal).

While the proposal focuses on applications in the area of security, the technology developed will have numerous applications beyond this. Understanding human behaviour plays a critical role in designing ICT systems in a human-centred manner, built for humans and based on models of human behaviour. Engineering ICT systems that can sense and understand the unstructured behaviour of human users is a challenge that goes beyond today's systems engineering paradigm; meeting it could free computer users from the classic keyboard and mouse and enable technologies like ambient intelligence and ubiquitous computing. Other potential benefits from efforts to automate the analysis of facial expressions are numerous and span fields as diverse as:
(1) cognitive sciences - automated tools would tremendously speed up current research processes, as they would replace the lengthy and tedious manual analysis of the studied behaviour.
(2) medicine - remote monitoring of conditions like pain and depression, remote assessment of drug effectiveness, computer-based remote treatment of facial paralysis, etc., would be possible, leading to more advanced personal wellness technologies than those available today.
(3) transportation - automatic assessment of the driver's stress level, detection of microsleeps, and spotting of driver puzzlement would be facilitated, enabling next-generation in-vehicle assistive technology.
(4) education - automatic assessment of students' interest level, puzzlement, and enjoyment would become possible, facilitating the development of truly intelligent tutoring systems.

We believe that the technology developed in this project has very high potential for commercialisation. In particular, the developed image acquisition technology will be of substantial interest beyond the area of facial expression analysis, e.g. in healthcare and the creative industries. The PIs already have extensive experience of close collaboration with industry, in particular in the healthcare domain, as well as of setting up spin-off companies. We will use this experience to work with industry to exploit opportunities for commercialisation of the developed technology. To preserve the potential for commercial exploitation we will protect the developed IP where appropriate (e.g. via patents, before dissemination to the community).

To ensure appropriate involvement of end users in the proposed research we have assembled an advisory team of potential users and industrial collaborators interested in the technology developed in this project. To ensure engagement with the wider community we also work in close collaboration with the Institute for Security Science and Technology at Imperial College London, which focuses on innovation in homeland and national security.

We will also disseminate our research to a wider audience through activities such as participation at the Royal Society Summer Exhibition, Meetings, etc.

Publications

Trigeorgis G (2018) Deep Canonical Time Warping for Simultaneous Alignment and Representation Learning of Sequences, in IEEE Transactions on Pattern Analysis and Machine Intelligence

Trigeorgis G (2017) A Deep Matrix Factorization Method for Learning Attribute Representations, in IEEE Transactions on Pattern Analysis and Machine Intelligence

Trigeorgis G (2016) Deep Canonical Time Warping

Tzimiropoulos G (2014) Active Orientation Models for Face Alignment In-the-Wild, in IEEE Transactions on Information Forensics and Security

Valstar M (2013) AVEC 2013

Wang M (2018) Disentangling the Modes of Variation in Unlabelled Data, in IEEE Transactions on Pattern Analysis and Machine Intelligence

Wang Y (2018) Face Mask Extraction in Video Sequence, in International Journal of Computer Vision

 
Description The key findings until now are the following:
(a) We have developed the first ever benchmarks for training and assessing the performance of facial landmark detection/tracking algorithms.
(b) We used the data to build the largest generic 3D morphable model to date (from over 10,000 people). We have also built models tailored to various groups (e.g., by ethnicity, age etc.)
(c) We built methods for putting 3D faces in dense correspondence, which are currently used to bring thousands of 3D frames into correspondence.
(d) We used models for 3D face reconstruction in arbitrary conditions, as well as extraction of dense facial motion.
Exploitation Route The databases and models developed through the project are currently used by the majority of state-of-the-art methods for facial landmark localisation and tracking.
Sectors Creative Economy, Digital/Communication/Information Technologies (including Software), Healthcare

URL http://ibug.doc.ic.ac.uk/resources
 
Description The database and publicly available software developed by 4D-FAB are used by many start-up companies to build facial landmark localisation/tracking methodologies (these methodologies are the cornerstone of many automatic facial analysis tasks, including face recognition, facial expression recognition etc.).
First Year Of Impact 2014
Sector Creative Economy, Digital/Communication/Information Technologies (including Software)
Impact Types Societal

 
Title 300 VW: Annotated database for facial landmark tracking "in-the-wild" 
Description The database contains per-frame annotations of 110+ facial videos (1 min+ each) captured in unconstrained conditions. Each frame was annotated with 68 facial landmarks (more than 150,000 frames in total). The database is used for assessing the performance of facial landmark tracking methodologies (or general deformable object tracking methodologies).
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
Impact The database was used in the first benchmark for facial landmark tracking "in-the-wild". The benchmark was used to run the first competition/challenge in conjunction with one of the top venues in the field, i.e. ICCV 2015. The database is currently used by many state-of-the-art facial landmark localisation methods. 
URL http://ibug.doc.ic.ac.uk/resources/300-VW/
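
As an illustration of how per-frame landmark annotations like these can be used to assess a tracker, the sketch below computes a normalised mean error between predicted and annotated landmarks. It is an assumption-laden example, not the benchmark's official protocol: the record above does not state which metric is used, and the function name, the choice of inter-ocular normalisation and the eye-corner indices (36 and 45, zero-based, in a 68-point markup) are hypothetical choices made here for illustration.

```python
import math


def normalised_mean_error(predicted, annotated, left_eye=36, right_eye=45):
    """Mean point-to-point error divided by the inter-ocular distance.

    `predicted` and `annotated` are equal-length sequences of (x, y) points.
    `left_eye` and `right_eye` are assumed zero-based indices of the outer
    eye corners; adjust them to the markup actually in use.
    """
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated landmark counts differ")
    inter_ocular = math.dist(annotated[left_eye], annotated[right_eye])
    if inter_ocular == 0:
        raise ValueError("inter-ocular distance is zero")
    mean_error = sum(
        math.dist(p, a) for p, a in zip(predicted, annotated)
    ) / len(annotated)
    return mean_error / inter_ocular


# Toy data (made up): 68 annotated points on a line, predictions offset by 9 px.
annotated = [(10.0 * i, 0.0) for i in range(68)]
predicted = [(x, y + 9.0) for x, y in annotated]
print(normalised_mean_error(predicted, annotated))
```

For a video, the same function could be applied to every annotated frame and the per-frame errors aggregated, for example into a cumulative error curve.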
 
Title 300W: Database of Facial Landmarks "in-the-wild" 
Description The database provides the first collection of facial data captured in unconstrained conditions and annotated with 68 landmarks. The database contains more than 12,000 annotated images.
Type Of Material Database/Collection of data 
Year Produced 2013 
Provided To Others? Yes  
Impact The database was used in the first two challenges in the field (the first held in conjunction with one of the top conferences of the field, i.e. ICCV 2013, and the second held as a special issue of a journal, Image and Vision Computing). Currently 300W is the de facto standard for assessing the performance of facial landmark localisation methodologies. It is used by the majority of the state-of-the-art methods reporting results in the best venues in the field (i.e., CVPR, ICCV, ECCV etc.). The paper describing the database has received more than 100 citations.
URL http://ibug.doc.ic.ac.uk/resources/300-W_IMAVIS/
 
Title Large Scale Facial Model 
Description LSFM is the largest-scale 3D Morphable Model (3DMM) of facial shapes ever constructed, based on a dataset of around 10,000 distinct facial identities from a huge range of gender, age and ethnicity combinations. The model has been built using a specially designed, fully automated system that accurately establishes dense correspondences among 3D facial scans and is robust to the large shape variability exhibited in human faces. LSFM includes not only a global 3DMM but also models tailored to specific age, gender or ethnicity groups. This was made possible by the extremely rich demographic information in the dataset. LSFM is built from two orders of magnitude more identity variation than current state-of-the-art models. Extensive experimental evaluations (Booth et al., CVPR'16) have shown that this additional training data leads to significant improvements in the characteristics of the statistical modelling of the 3D shape of human faces, and demonstrate that LSFM outperforms existing state-of-the-art models by a wide margin.
Type Of Material Computer model/algorithm 
Year Produced 2017 
Provided To Others? Yes  
Impact The model is currently used by many hospitals worldwide for face reconstructive surgery planning. 
URL https://xip.uclb.com/i/healthcare_tools/LSFM.html
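
To make the idea of a 3D Morphable Model more concrete, the sketch below shows how a new face shape can be generated from a linear statistical shape model: a mean shape plus a weighted sum of basis vectors, with each weight scaled by the standard deviation of its mode. This is a generic illustration under stated assumptions, not the LSFM implementation or file format; the function name, the flat [x0, y0, z0, x1, ...] vertex layout and the toy numbers are all hypothetical.

```python
import math


def synthesise_shape(mean_shape, components, eigenvalues, coefficients):
    """Return mean_shape + sum_i c_i * sqrt(lambda_i) * u_i.

    mean_shape   : flat list of vertex coordinates [x0, y0, z0, x1, ...]
    components   : list of basis vectors u_i, each as long as mean_shape
    eigenvalues  : variance lambda_i captured by each basis vector
    coefficients : shape parameters c_i, in units of standard deviations
    """
    if not (len(components) == len(eigenvalues) == len(coefficients)):
        raise ValueError("components, eigenvalues and coefficients differ in length")
    shape = list(mean_shape)
    for component, eigenvalue, coefficient in zip(components, eigenvalues, coefficients):
        if len(component) != len(mean_shape):
            raise ValueError("basis vector length does not match the mean shape")
        weight = coefficient * math.sqrt(eigenvalue)
        for k, value in enumerate(component):
            shape[k] += weight * value
    return shape


# Toy model (made up): two vertices and two modes of variation.
mean_shape = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
components = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # mode 1 moves vertex 0 along x
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],  # mode 2 moves vertex 1 along y
]
eigenvalues = [4.0, 9.0]
print(synthesise_shape(mean_shape, components, eigenvalues, [0.5, -1.0]))
```

Setting all coefficients to zero returns the mean shape; a model tailored to a demographic group would, under the same assumptions, simply use a different mean and basis.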
 
Title Menpo: Tools for 3D Morphable Model Construction 
Description The Menpo project provides publicly available open source tools for annotating and building 3D Morphable Models (3DMMs). In particular, it contains a high-quality, easy-to-use annotation tool, as well as many different methods for building 3DMMs (including non-rigid ICP routines, Active Appearance Models, warping etc.).
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact The Menpo project was used to develop the first 3DMM built from 10,000 people. This is, to the best of our knowledge, the largest-scale 3D Morphable Model ever constructed, containing statistical information from a huge variety of the human population (different age, gender and ethnicity groups).
URL http://www.menpo.org/