Adaptive Facial Deformable Models for Tracking (ADAManT)

Lead Research Organisation: Imperial College London
Department Name: Computing

Abstract

We propose to develop methodologies for the automatic construction of person-specific facial deformable models for robust tracking of facial motion in unconstrained videos (recorded 'in-the-wild'). The tools are expected to work well for data recorded by a device as cheap as a web-cam and in almost arbitrary recording conditions. The technology developed in the project is expected to have a huge impact on many different applications, including but not limited to biometrics (face recognition), Human Computer Interaction (HCI) systems, analysis and indexing of videos using facial information (e.g., on YouTube), facial motion capture in the games and film industries, and the creation of virtual avatars.

The novelty of the ADAManT technology is multi-faceted. We propose the very first robust, discriminative deformable facial models that can be customised, in an incremental fashion, so that they automatically tailor themselves to a person's face using image sequences captured under uncontrolled recording conditions (both indoors and outdoors). We also propose to build and publicly release the first database of facial videos recorded 'in-the-wild' that is annotated with regard to facial landmarks. Finally, we aim to use the database as the basis of the first competition on facial landmark tracking 'in-the-wild', which will run as a satellite workshop of a top vision venue (such as ICCV 2015).
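The incremental, person-specific customisation described above can be illustrated with a minimal sketch: a generic appearance template is blended towards the appearance of confidently tracked frames, while unreliable frames (e.g., occlusions) are ignored. This is an illustrative simplification, not the project's actual discriminative models; the function name, learning rate and confidence threshold are all hypothetical.

```python
# Minimal sketch of incremental person-specific model adaptation.
# Illustrative only: the actual ADAManT models are discriminative
# deformable models; names and thresholds here are hypothetical.

def update_template(template, frame_patch, confidence,
                    rate=0.1, threshold=0.8):
    """Blend a new frame's appearance into the person-specific
    template, but only when the tracker is confident, so that
    bad frames do not corrupt the model."""
    if confidence < threshold:
        return template  # skip unreliable frames (e.g., occlusions)
    return [(1.0 - rate) * t + rate * p
            for t, p in zip(template, frame_patch)]

# Start from a generic (person-independent) template ...
template = [0.5, 0.5, 0.5]
# ... and adapt it incrementally over the tracked frames.
for patch, conf in [([0.9, 0.1, 0.4], 0.95),
                    ([0.0, 0.0, 0.0], 0.30),   # occluded frame: ignored
                    ([0.8, 0.2, 0.5], 0.90)]:
    template = update_template(template, patch, conf)
```

The key design point this sketch captures is the gating by tracker confidence: only frames the model already fits well are allowed to reshape it, which is what makes unsupervised adaptation on 'in-the-wild' sequences feasible.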

As a proof of concept, and with a focus on novel applications, the ADAManT technology will be applied to (1) facial landmark tracking for machine analysis of behaviour in response to product adverts watched by people in the comfort of their homes (indoors) and (2) facial landmark tracking for automatic face verification using videos recorded by mobile devices (outdoors). In an increasingly global economy and an ever-ubiquitous digital age, the market can change rapidly. As stipulated by the UK Research Councils' Digital Economy Theme, realising substantial transformational impact on how new business models are created and how they take advantage of the digital world is one of the main challenges. As the human face is at the heart of many scientific disciplines and business models, the ADAManT project provides technology that can reshape established business models to become more efficient, as well as create new ones. Within EPSRC's ICT priorities our research is highly relevant to autonomous systems and robotics, since it enables the development of robots capable of understanding human behaviour in unconstrained environments (e.g., robot companions, robots as tourist guides, etc.).

Planned Impact

Technologies that can robustly and accurately track facial motion, as observed by the omnipresent webcams in digital devices, would have a profound impact on both basic science and the industrial sector. They would open up important new applications of ICT in basic research, medicine, healthcare, the digital economy and business.
(a) basic research - the application of computer vision tools for automatic tracking of facial signals in the wild could open up tremendous potential to measure behaviour indicators (in psychology, psychiatry, the security sector, etc.) that have heretofore resisted measurement because they were too subtle or fleeting to be captured by the human eye.
(b) computer science - affective interfaces, interactive games, implicit tagging of multimedia content, and various online services would all be enabled or enhanced by this novel technology, effectively leading to the development of the next generation of human-centred computing. Furthermore, the developed technologies could have a huge impact on the film (special effects) and games industries, which still perform facial motion capture with tedious manual procedures or require actors to wear special equipment that affects their performance. Finally, the developed tracking methodologies will facilitate the development of robust mobile biometric systems.
(c) robotics - the development of automated tools for tracking facial motion in unconstrained conditions would enable robots capable of understanding human behaviour in both indoor and outdoor environments (e.g., robot companions, robots as tourist guides, etc.).
(d) medicine and health care - many disorders in neurology and psychiatry (schizophrenia, suicidal depression, Parkinson's disease) involve aberrations in the display and interpretation of facial behaviour. ADAManT technology could provide tracked facial motion with the increased reliability, sensitivity, and precision needed to explore the relationship between facial behaviour and mental disorder, leading to new insights and remote diagnostic methods. Remote monitoring of conditions such as pain and depression, remote assessment of drug effectiveness, etc., would also become possible, leading to more advanced personal wellness technologies.
(e) digital economy and commercial applications - automatic measurement of consumers' liking in response to product ads will have a profound impact on automatic market research analysis, mainly because it offers the possibility of conducting massive market-research studies. This is in striking contrast to standard market research methods for collecting liking ratings, which are based on self-reporting techniques known to be notoriously slow, error-prone, and tedious. The ADAManT technology will be of great value not only to companies working in market research analysis (such as RealEyes) but also to their clients. This is exactly in line with the challenges listed in the UK Research Councils' Digital Economy Theme -- realising substantial transformational impact on how new business models are created and how they take advantage of the digital age. The ADAManT technology will also make possible the automatic assessment of a driver's stress level, detection of micro-sleeps, and spotting of driver puzzlement, effectively enabling the next generation of in-vehicle assistive technology. It will facilitate automatic assessment of students' interest, puzzlement, and enjoyment in online and E-education, enabling the development of truly intelligent tutoring systems.

Publications

Alabort-I-Medina J (2017) A Unified Framework for Compositional Fitting of Active Appearance Models. in International Journal of Computer Vision

Antonakos E (2015) Feature-based Lucas-Kanade and active appearance models. in IEEE Transactions on Image Processing

Antonakos E (2015) Active Pictorial Structures

Asthana A (2015) From Pixels to Response Maps: Discriminative Image Filtering for Face Alignment in the Wild. in IEEE Transactions on Pattern Analysis and Machine Intelligence

Chrysos GG (2018) PD2T: Person-Specific Detection, Deformable Tracking. in IEEE Transactions on Pattern Analysis and Machine Intelligence

Chrysos GG (2018) A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild". in International Journal of Computer Vision

 
Description During this award we developed a semi-automatic methodology for the cost- and time-efficient annotation of facial video frames. The methodology has been used to annotate all the frames of 110+ facial videos (1+ min per video) with regard to facial landmarks (over 150,000 frames in total). The annotations have been used to develop the first benchmark for facial landmark tracking, as well as to run the first challenge on the topic (http://ibug.doc.ic.ac.uk/resources/300-VW/).
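The core idea of such a semi-automatic pipeline is that an annotator labels only sparse keyframes, and the remaining frames receive machine-generated proposals that the annotator merely inspects and corrects. The sketch below illustrates the proposal step with simple linear interpolation between keyframes; this is a hedged simplification (the actual project methodology is more sophisticated, and the function name is hypothetical).

```python
# Hedged sketch of the proposal step in a semi-automatic annotation
# loop. Illustrative only: the real methodology uses model fitting,
# not plain interpolation; names here are hypothetical.

def interpolate_landmarks(key_a, key_b, n_between):
    """Propose landmark positions for the frames between two manually
    annotated keyframes by linear interpolation."""
    proposals = []
    for i in range(1, n_between + 1):
        w = i / (n_between + 1)  # fraction of the way from key_a to key_b
        proposals.append([(1 - w) * a + w * b
                          for a, b in zip(key_a, key_b)])
    return proposals

# Two manually annotated keyframes (x-coordinates of 3 landmarks).
key_a = [10.0, 20.0, 30.0]
key_b = [14.0, 24.0, 34.0]

# Propose annotations for the 3 frames in between; the annotator then
# only corrects these proposals instead of labelling from scratch.
proposals = interpolate_landmarks(key_a, key_b, 3)
```

Because most frames need only a quick visual check rather than full manual labelling, the per-frame annotation cost drops dramatically, which is what makes annotating 150,000+ frames tractable.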
Exploitation Route The benchmark is expected to have a huge impact in the field and beyond. Currently, the data are used by the majority of the academic community, as well as by industry working on face analysis in videos. The semi-automatic annotation methodology could also have impact on the analysis of other deformable objects (e.g., the human body), since it makes the annotation of large amounts of images efficient. Parts of the methodology are currently open source and publicly available in the Menpo project (http://www.menpo.org/).
Sectors Digital/Communication/Information Technologies (including Software)

URL http://ibug.doc.ic.ac.uk/resources/300-VW/
 
Description A facial landmark tracking software package developed through this award was perpetually licensed to the company SeeingMachines (http://www.seeingmachines.com/).
First Year Of Impact 2015
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Title 300 VW Facial Landmark Tracking Benchmark/Database 
Description 300 VW is the first benchmark for facial landmark tracking. It contains more than 150,000 annotated facial frames (each annotated with regard to 60+ facial landmarks). 
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
Impact The database has been used in the first challenge/competition for facial landmark tracking (held in conjunction with one of the top computer vision venues, ICCV 2015). Currently, the database/benchmark is used by state-of-the-art methods for assessing the performance of facial landmark tracking methodologies. 
URL http://ibug.doc.ic.ac.uk/resources/300-VW/
 
Title Facial landmark tracking 
Description A real-time facial landmark tracking software package. The software tracks 60+ facial landmarks in real time in arbitrary monocular intensity facial videos. The software has been perpetually licensed to SeeingMachines (http://www.seeingmachines.com/). 
IP Reference  
Protection Copyrighted (e.g. software)
Year Protection Granted 2015
Licensed Yes
Impact To the best of my knowledge the licensed technology is the current facial landmark tracker used by SeeingMachines.
 
Title Menpo Project: Parametric Image alignment and deformable model construction and fitting. 
Description The Menpo Project, hosted at http://www.menpo.io, is a BSD-licensed software platform providing a complete and comprehensive solution for annotating, building, fitting and evaluating deformable visual models from image data. Menpo is a powerful and flexible cross-platform framework written in Python that works on Linux, OS X and Windows. Menpo makes it easy to understand and evaluate the above complex algorithms, providing tools for visualisation, analysis, and performance assessment. A key challenge in building deformable models is data annotation; Menpo expedites this process by providing a simple web-based annotation tool hosted at http://www.landmarker.io. The Menpo Project is thoroughly documented and provides extensive examples for all of its features. We believe the project is ideal for researchers, practitioners and students alike. 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact The Menpo Project has been used to develop the annotations of the very popular 300 VW and 300-W (second version) benchmarks. 300 VW is the first benchmark for facial landmark tracking methodologies and 300-W is the most popular database for facial landmark localisation in arbitrary conditions (used by the majority of the state-of-the-art methods published in top computer vision venues, such as ICCV, CVPR, ECCV, etc.). The Menpo Project was demoed at ICCV 2015. Currently the Menpo Project has hundreds of users in both academia and industry. 
URL http://www.menpo.org/
 
Description Demo of Menpo Platform (open source platform for deformable model learning and fitting) in ICCV 2015 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact We presented our publicly available platform as a demo in conjunction with one of the most prestigious computer vision venues, the International Conference on Computer Vision (ICCV) 2015. The demo attracted a lot of attention and our platform is now used by hundreds of users all over the world (both academic and industrial).
Year(s) Of Engagement Activity 2015
URL http://www.menpo.org/