3D Intrinsic Shape Recognition Under Deformation and View Changes

Lead Research Organisation: Imperial College London
Department Name: Electrical and Electronic Engineering

Abstract

In this proposal, we tackle the novel visual recognition problem of identifying 3D (three-dimensional) deformable objects by their shape identities or categories. Images of 3D objects undergo large appearance changes due to different object poses (articulation or deformation) as well as camera view-points. We attempt to recognise objects from single images by their 3D shape identities (or intrinsic shapes), regardless of their current poses and camera view-points. Humans can perceive the 3D shapes of objects from single images, provided that they have previously seen the 3D shapes of other similar objects. This previously learnt knowledge of 3D shapes is called a 3D shape prior. A key idea for accomplishing the proposed task is to learn and exploit shape priors for object recognition.

The proposed research is well aligned with, and goes beyond, important topics in computer vision. Whereas much work on view-point invariant object recognition is limited to rigid object classes with rich textures, we consider deformable object shapes. A series of works in the field of single view reconstruction has shown promising results for human body shape reconstruction under pose variations. There has also been notable recent success in 3D human pose recognition. Building on these results, we go further to capture 3D intrinsic shape variations for object recognition. The intended outcomes would benefit the relevant academic fields and their existing markets, and could also lead to new applications such as automatic monitoring of public obesity and animal tracking.

Planned Impact

As the relevant academic topics have long been explored and have matured enough to yield commercial application markets, our research potentially benefits those existing markets, e.g. content-based image retrieval, photo album management and surveillance systems, by recognising intrinsic object shapes. Human shape modelling in the proposed work is itself a useful output for man-machine interfaces such as avatars, when it explicitly yields shape reconstruction. It also helps human pose recognition, which was recently deployed in a commercial console game with huge success, by making it invariant to human shape changes.

The proposed research could spin out new applications, of particular interest for public health care. Obesity, a major public health concern, has doubled since 1980 and has been a top social issue in the UK and worldwide. Systematic ways of monitoring and managing the degree of overweight in individuals and the public are in high demand. An interesting new sub-topic of computer vision, food recognition [49,57], is relevant to this issue. Owing to the widespread use of smartphones equipped with cameras, users can take photos of the food they eat each day, and the software automatically recognises food types and provides useful dietary information according to a pre-defined food calorie table. Various mobile health projects (called m- or e-health) have also been carried out for health services and information. The proposed system can automatically categorise human shapes in real time and without users' attention or notice (no particular pose or camera view-point is needed). The estimated shapes can lead to more direct measurements of obesity, i.e. waist circumference and weight. We intend to carry on our research in the context of vision-based health care, through human shape recognition, food recognition and, in future, their combination.

Our solution aims at recognising individual human shapes or shape categories. This further strengthens existing biometric solutions (face, gait recognition) in an appropriate environment. An X-ray type scanner that produces "naked" images of passengers has been adopted to speed up security checks at Manchester Airport. With the aid of such a sensor, clothing effects are removed when the proposed silhouette-based algorithm is deployed. The proposed algorithm can be combined with existing face or gait recognisers to improve accuracy. The real-time solution does not need to save images for further use and quickly destroys images to lessen privacy concerns.

Our proposed research addresses generic object categories and is not limited to human bodies. Animal tracking and monitoring, especially for horses or big fish, has long been studied. It requires a non-invasive means of monitoring without human observers, and present satellite navigation tracking systems are costly and require specific sensors. The proposed vision-based solution would help improve existing solutions by taking large shape and pose variations into account, and could detect animal sub-species by their shape categories in a reasonably constrained environment.

Our research and its successful outcomes would positively influence security and health related policy-makers and government agencies. Given the adverse impacts of government funding cuts on UK national health services, technological development in mobile health care can help achieve the productivity and cost-efficiency savings necessary in both public and private healthcare sectors.

Through the intended project activities, the P.I. would learn skills in project management and professional networking, and the R.A. would learn the skills needed to carry out the main research activities, including efficient coding, management of different program versions and code sharing. Both the P.I. and the R.A. would be involved in, and learn the skills of, interacting with a wide spectrum of audiences and giving a clear description of the project goals and achievements for maximum dissemination of the results.

Publications

 
Description We have developed a set of theories and methods that fit a 3D deformable model (human body, hand, face, etc.) to input images.
Exploitation Route They are useful to understand visual contents in 3D, their view points, articulation, and deformation as well as object classes.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description The developed system and program have been used for (1) Huawei industrial grant on 3D hand pose estimation, and (2) Samsung industrial grant on reinforcement learning for robot grasping. In these applications, the key findings are applied to address the articulated and highly deformable object recognition problems.
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Description EPSRC program grant (Face Matching for Automatic Identity Retrieval, Recognition, Verification and Management)
Amount £5,958,623 (GBP)
Funding ID EP/N007743/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 12/2020
 
Description Industrial grant (human pose/activity detection)
Amount £82,000 (GBP)
Organisation Omron Corporation 
Sector Private
Country Japan
Start 10/2015 
End 09/2016
 
Title Unified Face Analysis Dataset 
Description A new benchmark to evaluate the joint performance on sub-problems of face 
Type Of Material Database/Collection of data 
Provided To Others? No  
Impact Will be helpful for other researchers in the field 
 
Description A joint paper with Prof. Cipolla's group in Univ. of Cambridge 
Organisation University of Cambridge
Country United Kingdom 
Sector Academic/University 
PI Contribution A new benchmark on 3D human body pose and action detection, and a new framework to conduct pose and action estimation simultaneously
Collaborator Contribution detailed formulations, implementations, experiments, writing-up
Impact T.H. Yu, T-K. Kim, and R. Cipolla, Unconstrained Monocular 3D Human Pose Estimation by Action Detection and Cross-modality Regression Forest, Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, USA, 2013.
Start Year 2012
 
Description Joint papers with Center for Research and Technology Hellas (CERTH), Thessaloniki, Greece 
Organisation Centre for Research and Technology Hellas (CERTH)
Country Greece 
Sector Academic/University 
PI Contribution novel ideas to perform active visual perception, comparative experiments
Collaborator Contribution detailed formulations of ideas, implementations, data work, experiments, writing-up
Impact A. Doumanoglou, T-K. Kim, X. Zhao, S. Malassiotis, Active Random Forests: An application to Autonomous Unfolding of Clothes, Proc. of European Conference on Computer Vision (ECCV), Zurich, Switzerland, 2014. A. Doumanoglou, A. Kargakos, T-K. Kim, S. Malassiotis, Autonomous Active Recognition and Unfolding of Clothes using Random Decision Forests and Probabilistic Planning, Proc. of IEEE Int. Conf. on Robotics and Automation (ICRA), Hong Kong, China, 2014 (KUKA best service robotics paper award).
Start Year 2012
 
Description Associate Editor of (Elsevier) Image and Vision Computing Journal 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I joined the editorial board of a premium international journal in the field of computer vision.
Year(s) Of Engagement Activity 2016,2017
 
Description Associate Editor of IPSJ Transactions on Computer Vision and Applications (CVA), 2016-2018 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I joined the editorial board of an international journal in the field of computer vision.
Year(s) Of Engagement Activity 2016,2017
 
Description BMVA Executive Committee member 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact I joined the BMVA Executive Committee for the period Jan 2016-Dec 2018. The BMVA organises and supports various academic events and activities in the field of computer vision, and the BMVA Executive Committee meets regularly throughout each year.
Year(s) Of Engagement Activity 2016,2017
 
Description Demo, Imperial College Science Festival May 2016 (500+ visitors) 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact We gave a hands-on demo of (deformable and articulated object) pose estimation at the Imperial College Science Festival, May 2016. Our demo attracted 500+ visitors.
Year(s) Of Engagement Activity 2016
 
Description General chair of BMVC17 in London, UK 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact We organised a top-tier computer vision conference at Imperial College. It received a record number of paper submissions (650) and attendees (500) in the history of BMVC.
Year(s) Of Engagement Activity 2017
 
Description General co-chair of British Machine Vision Conference (BMVC), London, Sep 2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I am co-organising this premium international conference on computer vision in London. More than 350 participants and more than 600 high-quality paper submissions are expected. The most renowned academic figures in the field are confirmed as keynote and tutorial speakers for the event. This year's event is expected to be a landmark in various respects.
Year(s) Of Engagement Activity 2017
 
Description General co-chair of IEEE 2nd workshop on observing and understanding hands in action (HANDS, in conjunction with CVPR), Las Vegas, USA, Jun 2016 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I co-organised the international workshop on hand pose estimation in conjunction with CVPR 2016. More than 55 PG students, academics and professionals attended the workshop, at which 4 oral papers and 10 posters were presented. We also had 5 invited talks from renowned experts on the topic.
Year(s) Of Engagement Activity 2016
 
Description General co-chair of IEEE 3rd workshop on observing and understanding hands in action (HANDS) and the 2017 Hands in the Million Challenge, (in conjunction with ICCV), Venice, Italy 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The workshop was very well attended and effectively promoted the research topic to the audience. The challenge organised in conjunction with the workshop was very successful, receiving about 20 participating entries, and the effort was in turn published as a research article at CVPR 2018.
Year(s) Of Engagement Activity 2017
 
Description Guest Editor of (Elsevier) Pattern Recognition Letters Special Issue on Personalised and Context-sensitive Interfaces in the Wild, 2016. 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact I have served as a guest editor of a special issue of the Pattern Recognition Letters journal.
Year(s) Of Engagement Activity 2016,2017
 
Description Invited lecture/lab, BMVA computer vision summer school, Swansea, UK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact More than 60 PG students attended this school, and I gave a 1.5-hour lecture and a 1.5-hour hands-on session on random decision forests with deep learning. The participants expressed a lot of interest in the topics and said they would use what they had learnt in their PG studies. The school reported very good feedback from the students.
Year(s) Of Engagement Activity 2016
 
Description Invited talk at Deep learning summit, London, UK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact More than 300 engineers, entrepreneurs and postgraduate students attended the summit. I gave an invited talk with a live demo on deep learning and random forests, with applications to hand pose estimation and face recognition. The summit organisers reported very good feedback from the audience on the event and topics.
Year(s) Of Engagement Activity 2016
 
Description Invited talk at Korean Conf. on Computer Vision 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact More than 200 PG students and engineers from industry attended this event, where I gave an invited talk on deep learning and tree-structured algorithms for hand (articulated object) pose and face (deformable object) recognition. The talk prompted many questions and discussions during the event.
Year(s) Of Engagement Activity 2016
 
Description Invited talk at Omron Corporation, Kyoto, Japan, 2013 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The talk covered our own contributions to challenging novel problems: real-time action and 3D posture recognition, and object phenotype recognition, which are part of the project. Randomised Decision Forests and tree-structured methods were proposed for real-time vision solutions.

Increase in requests for further involvement in the related research activities
Year(s) Of Engagement Activity 2013
 
Description Invited talk in Seminar Series of the Robotics Research Group at University of Oxford, UK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Participants in your research and patient groups
Results and Impact The talk was about the Randomised Decision Forest, an emerging technique in the relevant fields that has been highly successful in various real-time vision solutions. In this talk, Randomised Decision Forests and tree-structured methods were reviewed with comparative and insightful discussions, leading to our own contributions to challenging novel problems: real-time action and 3D posture recognition, and object phenotype recognition, which we recently tackled at Imperial College London.

Increase in requests for further involvement in the related research activities
Year(s) Of Engagement Activity 2012
 
Description Keynote at Korea-Japan joint workshop on Frontiers of Computer Vision, Takayama, Japan 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact More than 150 PG students and professors attended this conference, where I gave a keynote. This is a unique event for computer vision researchers, held especially to promote collaboration and knowledge-sharing between Korea and Japan. My talk on deep learning and tree-structured algorithms sparked many questions and much interest, and the conference organisers reported increased attendance and excellent feedback on my talk.
Year(s) Of Engagement Activity 2016
 
Description Keynote at Samsung AI Forum, Samsung Software R&D Center, Seoul, Korea 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact I gave an invited talk at the AI forum organised by the Samsung group. More than 300 people attended the event, and the talk was broadcast to the whole Samsung group.
Year(s) Of Engagement Activity 2017
 
Description Lecture at BMVA Computer Vision Summer School, Swansea, UK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Participants in your research and patient groups
Results and Impact Attracted the participating PhD students and practitioners to the topic

Increase in requests for further involvement in related research/teaching activities
Year(s) Of Engagement Activity 2014
 
Description Live demo at Imperial College Science Festival 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact 6D robot vision demo, in the Robotics Forum stand, Imperial College Science Festival May 2015. About 20,000 visitors were received.
Year(s) Of Engagement Activity 2015