DEFORM: Large Scale Shape Analysis of Deformable Models of Humans
Lead Research Organisation:
Imperial College London
Department Name: Computing
Abstract
Computer vision is currently witnessing a paradigm shift. Standard hand-crafted features, such as the Scale-Invariant Feature Transform (SIFT) and Histograms of Oriented Gradients (HOGs), are being replaced by learnable filters through the application of Deep Convolutional Neural Networks (DCNNs). Furthermore, for applications (e.g., detection, tracking, and recognition) that involve deformable objects, such as human bodies, faces, and hands, traditional statistical or physics-based deformable models are being combined with DCNNs to very good effect. This progress has been made possible by the abundance of complex visual data in the Big Data era, spread mostly through the Internet via web services such as YouTube, Flickr, and Google Images. This abundance has led to the development of huge databases (such as ImageNet, Microsoft COCO, and 300W) consisting of visual data captured "in-the-wild". Furthermore, the scientific and industrial community has undertaken large-scale annotation tasks. For example, my group and I have made huge efforts to annotate over 30K facial images and 500K video frames with a large number of facial landmarks, and the COCO team has annotated thousands of body images with body joints. All the above annotations generally refer to sparse sets of object parts and/or their segments, which can be annotated by humans (e.g., through crowdsourcing). In order to take the next step in automatic understanding of scenes in general, and of humans and their actions in particular, the community needs to acquire dense 3D information. Even though the collection of 2D intensity images is now a relatively easy and inexpensive process, the collection of high-resolution 3D scans of deformable objects, such as humans and their (body) parts, remains expensive and laborious. This is the principal reason why very limited efforts have been made to collect large-scale databases of 3D faces, heads, hands, and bodies.
In DEFORM, I propose to perform large-scale collection of high-resolution 4D sequences of humans. Furthermore, I propose new lines of research in order to provide high quality annotations regarding the correspondences between the 2D intensity "in-the-wild" images and the dense 3D structure of deformable objects' shapes and in particular of humans and their parts. Establishing dense 2D-to-3D correspondences can effortlessly solve many image-level tasks such as landmark (part) localisation, dense semantic part segmentation, estimation of deformations (i.e., behaviour), etc.
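To illustrate the claim that dense 2D-to-3D correspondences make image-level tasks such as landmark localisation "effortless", the following is a minimal hypothetical sketch (not DEFORM's actual pipeline): assume each foreground pixel has been assigned the index of its corresponding vertex on a shared template mesh. A landmark defined as a template vertex is then localised by a simple lookup, with occlusion handled for free when the vertex maps to no pixel.

```python
import numpy as np

def localise_landmarks(correspondence_map, landmark_vertex_ids):
    """Given a per-pixel map of template-vertex indices (background = -1),
    return the (row, col) of each requested landmark vertex, or None if
    that vertex is not visible in the image."""
    locations = []
    for vid in landmark_vertex_ids:
        rows, cols = np.nonzero(correspondence_map == vid)
        locations.append((int(rows[0]), int(cols[0])) if rows.size else None)
    return locations

# Toy 4x4 "image" whose pixels carry template vertex indices.
cmap = np.full((4, 4), -1)
cmap[1, 2] = 10   # hypothetical nose-tip vertex visible at pixel (1, 2)
cmap[3, 0] = 42   # hypothetical chin vertex visible at pixel (3, 0)

print(localise_landmarks(cmap, [10, 42, 99]))  # vertex 99 is occluded -> None
```

Dense semantic part segmentation follows the same pattern: label each template vertex with a part id once, and every image pixel inherits its part label through the correspondence map.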
Planned Impact
The impact of the DEFORM technology will be enormous, as it will enable the creation of important new applications of ICT in basic research, medicine, healthcare/bioengineering, wearable devices, robotics, virtual/augmented reality (VR/AR), digital economy and business, to name a few. More precisely, the impact of DEFORM spans many different fields, including, but not limited to:
- Computer Vision and Machine Learning: the algorithms and statistical models developed in DEFORM can revolutionise automatic analysis and understanding of humans in images and videos.
- VR, graphics and computer games: the statistical models of human face/body/hand shape and texture can be used to create huge amounts of realistic human models for populating VR worlds and games (currently, the cost of creating content for VR applications is one of the reasons impeding progress in the field).
- Medicine, anthropology and forensics: the statistical models can be used to create normative statistical distributions for bodies and hands. In order to maximise the impact of the collected data a clinician will be involved in data collection.
- Bio-engineering, wearables and prosthetics: the statistical models of the 3D shape of bodies and hands can be used to design personalised prosthetic parts and wearable devices.
The research programme of DEFORM provides excellent opportunities for public engagement. The database collection at the Science Museum, London (SML) will give my research team an opportunity to interact with thousands of people and provide them with a clear understanding of the uses and limitations of the technology. My team will also record the views, ideas and concerns of the public regarding the use of technologies relevant to DEFORM. A dynamic website will host the research and data; its sections and social media channels (e.g. a dedicated Twitter feed) will be directed at non-scientists. Team members will regularly contribute to the website's blog, Twitter feed and podcast to explain their work. I will exploit outreach opportunities for face-to-face engagement, such as the British Science Festival and the Royal Society Summer Science Exhibition, providing training for researchers as needed.
We believe that the technology developed in this project has very high potential for commercialisation. In particular, the developed statistical models of high-resolution 3D bodies, hands and faces could be licensed to industries working in computer vision, graphics, VR, AR and film post-production. I already have extensive experience of close collaboration with industry and of licensing research outcomes, and I will use this experience to work with industry to exploit opportunities for commercialisation of the developed technology. The industrial project partners will also help in this direction. To ensure the potential for commercial exploitation, I will protect the developed IP where appropriate (e.g. via patents, before dissemination to the community).
People |
ORCID iD |
Stefanos Zafeiriou (Principal Investigator / Fellow) |
Publications
Alexandridis K (2022) Inverse Image Frequency for Long-tailed Image Recognition
Alexandridis KP (2023) Inverse Image Frequency for Long-Tailed Image Recognition. In IEEE Transactions on Image Processing
Bahri M (2021) Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation. In International Journal of Computer Vision
Bahri M (2021) Binary Graph Neural Networks
Chrysos G (2020) RoCGAN: Robust Conditional GAN. In International Journal of Computer Vision
Chrysos G (2020) P-nets: Deep Polynomial Neural Networks
Chrysos G (2019) PolyGAN: High-Order Polynomial Generators
Description | We have developed the first large-scale statistical models of the human head and face. We have developed face recognition methodologies that were made publicly available and appear on the NIST leaderboard (https://pages.nist.gov/frvt/html/frvt11.html). We have developed methods for synthesising faces. We have started a large-scale data collection of hands and bodies at the Science Museum, London; from this collection we have developed large-scale statistical models of the body and hand, which have been presented at top venues. InsightFace (https://insightface.ai/), publicly available software for face analysis, has been released and is now curated by the community; its GitHub repository has received over 20K stars and has been downloaded by over 100K people. The Handy hand model was made publicly available (https://github.com/rolpotamias/handy) and is now used by hundreds of researchers worldwide. A large-scale head and face model was made publicly available (https://github.com/steliosploumpis/Universal_Head_3DMM). A tongue model and dataset were made publicly available (https://github.com/steliosploumpis/tongue#public-release-tongue-dataset). A generative model of facial texture was made publicly available (https://github.com/barisgecer/TBGAN). |
Exploitation Route | The code has been made publicly available and is now used by many practitioners (over 20K GitHub stars). The paper describing the work has already received over 6,000 citations, even though it was published only 8 months ago. The models made publicly available are used by over 100K users. |
Sectors | Digital/Communication/Information Technologies (including Software) |
URL | https://insightface.ai/ |
Description | During the project, Imperial College London collaborated with Ariel AI Inc/Ltd within the EPSRC project DEFORM. Ariel AI Inc/Ltd licensed certain technology that was developed in the project and has recently been acquired by Snap Inc [1]. Using the developed technology, a new kit for 3D body and hand tracking has been released by Snap Inc [2]. [1] https://www.cnbc.com/2021/01/26/snap-acquires-ariel-ai-to-boost-snapchat-augmented-reality-features.html [2] https://www.linkedin.com/posts/iasonas-kokkinos-1a4593157_augmentedreality-snapchat-activity-6771000735577448448-Iyan/ |
First Year Of Impact | 2020 |
Sector | Creative Economy, Digital/Communication/Information Technologies (including Software) |
Impact Types | Societal, Economic |
Company Name | Ariel AI |
Description | Ariel AI develops 3D modelling software for mobile phones. |
Year Established | 2018 |
Impact | The company was co-founded by several members of the DEFORM project. The company licensed technology developed in the EPSRC DEFORM project and has recently been acquired by Snap Inc. [1] [1] https://www.cnbc.com/2021/01/26/snap-acquires-ariel-ai-to-boost-snapchat-augmented-reality-features.html |
Website | http://www.arielai.com |