3D Intrinsic Shape Recognition Under Deformation and View Changes

Lead Research Organisation: Imperial College London

Department Name: Electrical and Electronic Engineering

Abstract

In this proposal, we tackle the novel visual recognition problem of recognising the shape identities or categories of 3D (three-dimensional) deformable objects. Images of 3D objects undergo large appearance changes due to different object poses (articulation or deformation) as well as camera view-points. We attempt to recognise objects from single images by their 3D shape identities (or intrinsic shapes) regardless of their present poses and camera view-points. Humans can perceive 3D shapes of objects from single images, provided that they have previously seen 3D shapes of other similar objects. This previously learnt knowledge of 3D shapes is called a 3D shape prior. A key idea for fulfilling the proposed task is to learn and exploit shape priors for object recognition.

The proposed research is well aligned with, and goes beyond, important topics of computer vision. Whereas much work on view-point invariant object recognition is limited to rigid object classes with rich textures, we consider deformable object shapes. In a series of works in the field of single view reconstruction, promising results have been shown for human body shape reconstruction under pose variations. There has also been notable recent success in 3D human pose recognition. Building on these results, we go further to capture 3D intrinsic shape variations for object recognition. The intended outcomes would benefit the relevant academic fields and their existing markets, and would also lead to potential new applications such as automatic monitoring of public obesity and animal tracking.

Planned Impact

As the relevant academic topics have long been explored and have matured enough to yield commercial application markets, our research potentially benefits those existing markets, e.g. content-based image retrieval, photo album management and surveillance systems, by recognising intrinsic object shapes. Human shape modelling in the proposed work is itself a useful output for man-machine interfaces such as avatars, when it explicitly yields shape reconstruction. It also helps make human pose recognition, which was recently deployed in a commercial console game with huge success, invariant to human shape changes.

The proposed research potentially spins out new applications, especially in public health care. Obesity, a major public health concern, has doubled since 1980 and has been a top social issue in the UK and the world. Systematic ways of monitoring and managing individual and public levels of overweight are in high demand. An interesting new sub-topic of computer vision study, food recognition [49,57], is relevant to this issue. Owing to the widespread use of smart phones equipped with a camera, users can take photos of the food they consume each day, and the software automatically recognises food types and provides users with useful dietary information according to a pre-defined food calorie table. Various mobile health projects (called m- or e-health) have also been carried out for health services and information. The proposed system can automatically categorise human shapes in real-time and without users' attention or notice (no particular pose or camera view-point needed). The estimated shapes can lead to more direct measurements of obesity, i.e. waist circumferences and weights. We intend to carry on our research in the context of vision-based health care, through human shape recognition, food recognition and also their combination in future.

Our solution aims at recognising individual human shapes or shape categories. This further strengthens existing biometric solutions (face, gait recognition) in an appropriate environment. An X-ray type scanner that produces "naked" images of passengers has been adopted for speeding up security checks at Manchester Airport. With the aid of such a sensor, clothing effects would be removed when deploying the proposed silhouette-based algorithm. The proposed algorithm can be combined with existing face or gait recognisers to improve accuracy. The real-time solution does not need to save images for further use and quickly destroys them to lessen privacy issues.

Our proposed research is for generic object categories, not limited to human bodies. Animal tracking and monitoring, especially for horses or big fish, has long been studied. It requires a non-invasive means of monitoring without human observers, and the present satellite navigation tracking systems are costly and require specific sensors. The proposed vision-based solution would help improve existing solutions by taking large shape and pose variations into account, and would help detect animal sub-species by their shape categories in a reasonably constrained environment.

Our research and successful outcomes would positively influence security and health related policy-makers and government agencies. Given the adverse impacts of government funding cuts on UK national health services, technological development in mobile health care can help achieve the productivity gains and cost savings necessary in both public and private healthcare sectors.

From the intended project activities, the P.I. would learn skills for project management and professional networking, and the R.A. the skills to carry out main research activities including efficient coding, management of different program versions and code sharing. Both P.I. and R.A. would be involved in and learn skills of interacting with a wide spectrum of audiences, and giving a clear description of the project goals and achievements for maximum dissemination of the results.

Funded Value:

£97,751

Funded Period:

Jun 12 - Aug 14

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/J012106/1

Principal Investigator:

Tae-Kyun Kim

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Image & Vision Computing (100%)

Organisations

People	ORCID iD
Tae-Kyun Kim (Principal Investigator)

Publications

Luo W (2019) Trajectories as Topics: Multi-Object Tracking by Topic Discovery. in IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

Tang D (2019) Opening the Black Box: Hierarchical Sampling Optimization for Hand Pose Estimation. in IEEE transactions on pattern analysis and machine intelligence

S. Yuan (2018) 3D Hand Pose Estimation: From Current Achievements to Future Goals

S. Baek (2018) Augmented skeleton space transfer for depth-based hand pose estimation

Yuan S. (2017) Big Hand 2.2M Benchmark: Hand Pose Data Set and State of the Art Analysis

Tang D (2017) Latent Regression Forest: Structured Estimation of 3D Hand Poses. in IEEE Transactions on Pattern Analysis and Machine Intelligence

Shi Z. (2017) Learning and Refining of Privileged Information-based RNNs for Action Recognition from Depth Sequences

Garcia-Hernando G. (2017) Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition

Doumanoglou A (2016) Folding Clothes Autonomously: A Complete Pipeline. in IEEE Transactions on Robotics

Xiong C (2016) Convolutional Fusion Network for Face Verification in the Wild. in IEEE Transactions on Circuits and Systems for Video Technology

Key Findings
Impact Summary
Further Funding
Research Databases and Models
Collaboration
Engagement Activities


Description	We have developed a set of theories and methods that fit a 3D deformable model (human body, hand, face, etc) to input images.
Exploitation Route	They are useful to understand visual contents in 3D, their view points, articulation, and deformation as well as object classes.
Sectors	Digital/Communication/Information Technologies (including Software)


Description	The developed system and program have been used for (1) Huawei industrial grant on 3D hand pose estimation, and (2) Samsung industrial grant on reinforcement learning for robot grasping. In these applications, the key findings are applied to address the articulated and highly deformable object recognition problems.
Sector	Digital/Communication/Information Technologies (including Software)
Impact Types	Economic


Description	EPSRC program grant (Face Matching for Automatic Identity Retrieval, Recognition, Verification and Management)
Amount	£5,958,623 (GBP)
Funding ID	EP/N007743/1
Organisation	Engineering and Physical Sciences Research Council (EPSRC)
Sector	Public
Country	United Kingdom
Start	01/2016
End	12/2020


Description	Industrial grant (human pose/activity detection)
Amount	£82,000 (GBP)
Organisation	Omron Corporation
Sector	Private
Country	Japan
Start	09/2015
End	09/2016


Title	Unified Face Analysis Dataset
Description	A new benchmark to evaluate the joint performance on sub-problems of face
Type Of Material	Database/Collection of data
Provided To Others?	No
Impact	Will be helpful for other researchers in the field


Description	A joint paper with Prof. Cipolla's group in Univ. of Cambridge
Organisation	University of Cambridge
Country	United Kingdom
Sector	Academic/University
PI Contribution	a new benchmark on 3D human body pose and action detection, a new framework to conduct pose and action estimation simultaneously
Collaborator Contribution	detailed formulations, implementations, experiments, writing-up
Impact	T.H. Yu, T-K. Kim, and R. Cipolla, Unconstrained Monocular 3D Human Pose Estimation by Action Detection and Cross-modality Regression Forest, Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, USA, 2013.
Start Year	2012


Description	Joint papers with Center for Research and Technology Hellas (CERTH), Thessaloniki, Greece
Organisation	Centre for Research and Technology Hellas (CERTH)
Country	Greece
Sector	Academic/University
PI Contribution	novel ideas to perform active visual perception, comparative experiments
Collaborator Contribution	detailed formulations of ideas, implementations, data work, experiments, writing-up
Impact	A. Doumanoglou, T-K. Kim, X. Zhao, S. Malassiotis, Active Random Forests: An application to Autonomous Unfolding of Clothes, Proc. of European Conference on Computer Vision (ECCV), Zurich, Switzerland, 2014. A. Doumanoglou, A. Kargakos, T-K. Kim, S. Malassiotis, Autonomous Active Recognition and Unfolding of Clothes using Random Decision Forests and Probabilistic Planning, Proc. of IEEE Int. Conf. on Robotics and Automation (ICRA), Hong Kong, China, 2014 (KUKA best service robotics paper award).
Start Year	2012


Description	Associate Editor of (Elsevier) Image and Vision Computing Journal
Form Of Engagement Activity	A magazine, newsletter or online publication
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	I joined the editorial board of a premium international journal in the field of computer vision.
Year(s) Of Engagement Activity	2016,2017


Description	Associate Editor of IPSJ Transactions on Computer Vision and Applications (CVA), 2016-2018
Form Of Engagement Activity	A magazine, newsletter or online publication
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	I joined the editorial board of an international journal in the field of computer vision.
Year(s) Of Engagement Activity	2016,2017


Description	BMVA Executive Committee member
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	I joined the BMVA Executive Committee for the period of Jan 2016-Dec 2018. The BMVA organises/supports various academic events and activities in the field of computer vision, and the BMVA Executive Committee meets regularly throughout each year.
Year(s) Of Engagement Activity	2016,2017


Description	Demo, Imperial College Science Festival May 2016 (500+ visitors)
Form Of Engagement Activity	Participation in an open day or visit at my research institution
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Schools
Results and Impact	We did a hands-on demo for (deformable and articulated object) pose estimation at Imperial College Science Festival, May 2016. Our demo attracted/received 500+ visitors.
Year(s) Of Engagement Activity	2016


Description	General chair of BMVC17 in London, UK
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	We organised a top-tier premium computer vision conference at Imperial College. The conference received 650 paper submissions and 500 attendees, both record numbers in the history of BMVC.
Year(s) Of Engagement Activity	2017


Description	General co-chair of British Machine Vision Conference (BMVC), London, Sep 2017
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	I co-organise this premium international conference on computer vision in London. Over 350 participants and over 600 high-quality paper submissions are expected. The most renowned academic figures in the field are confirmed as keynote and tutorial speakers for the event. The event this year is expected to be a landmark in several respects.
Year(s) Of Engagement Activity	2017


Description	General co-chair of IEEE 2nd workshop on observing and understanding hands in action (HANDS, in conjunction with CVPR), Las Vegas, USA, Jun 2016
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	I co-organised the international workshop on hand pose estimation in conjunction with CVPR 2016. Over 55 PG students/academics/professionals attended this workshop, which featured 4 oral papers and 10 posters. We also had 5 invited talks from renowned experts on the topic.
Year(s) Of Engagement Activity	2016


Description	General co-chair of IEEE 3rd workshop on observing and understanding hands in action (HANDS) and the 2017 Hands in the Million Challenge, (in conjunction with ICCV), Venice, Italy
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	The workshop was very well attended, effectively promoting the research topic to the audience. The challenge organised in conjunction with the workshop was very successful, receiving about 20 participating entries, and the effort was in turn published as a research article in CVPR 2018.
Year(s) Of Engagement Activity	2017


Description	Guest Editor of (Elsevier) Pattern Recognition Letters Special Issue on Personalised and Context-sensitive Interfaces in the Wild, 2016.
Form Of Engagement Activity	A magazine, newsletter or online publication
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	I have served as a guest editor of a special issue of the Pattern Recognition Letters journal.
Year(s) Of Engagement Activity	2016,2017


Description	Invited lecture/lab, BMVA computer vision summer school, Swansea, UK
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	Over 60 PG students attended this school, and I gave a 1.5-hour lecture and a 1.5-hour hands-on session on random decision forests with deep learning. The participants expressed a lot of interest in the topics and said they would use what they had learnt in their PG studies. The school reported very good feedback from the students.
Year(s) Of Engagement Activity	2016


Description	Invited talk at Deep learning summit, London, UK
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Over 300 engineers/entrepreneurs and postgraduate students attended the summit. I gave an invited talk with a live demo on deep learning and random forests, with applications to hand pose estimation and face recognition. The summit organisers reported very good feedback from the audience on the event/topics.
Year(s) Of Engagement Activity	2016


Description	Invited talk at Korean Conf. on Computer Vision
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	Over 200 PG students and engineers from industry attended this event, where I gave an invited talk on deep learning and tree-structured algorithms for hand (articulated object) pose and face (deformable object) recognition, which prompted many questions and discussions during the event.
Year(s) Of Engagement Activity	2016


Description	Invited talk at Omron Corporation, Kyoto, Japan, 2013
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	The talk covered our own contributions to challenging novel problems: real-time action, 3D posture recognition and object phenotype recognition, which are a part of the project. Randomised Decision Forests and tree-structured methods were proposed for real-time vision solutions. The talk resulted in an increase in requests for further involvement in related research activities.
Year(s) Of Engagement Activity	2013


Description	Invited talk in Seminar Series of the Robotics Research Group at University of Oxford, UK
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Participants in your research and patient groups
Results and Impact	The talk was about Randomised Decision Forests, an emerging technique in relevant fields that has been highly successful for various real-time vision solutions. In this talk, Randomised Decision Forests and tree-structured methods were reviewed with comparative and insightful discussions, leading to our own contributions to challenging novel problems: real-time action and 3D posture recognition and object phenotype recognition, which we recently tackled at Imperial College London. The talk resulted in an increase in requests for further involvement in related research activities.
Year(s) Of Engagement Activity	2012


Description	Keynote at Korea-Japan joint workshop on Frontiers of Computer Vision, Takayama, Japan
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	Over 150 PG students and professors attended this conference, where I gave a keynote. This is a unique event for computer vision researchers, especially in promoting collaboration and knowledge-sharing between Korea and Japan. My talk on deep learning and tree-structured algorithms sparked many questions and much interest, and the conference organisers reported increased attendance and excellent feedback on my talk.
Year(s) Of Engagement Activity	2016


Description	Keynote at Samsung AI Forum, Samsung Software R&D Center, Seoul, Korea
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	I gave an invited talk at the AI forum organised by the Samsung group. More than 300 people attended the event, and the talk was broadcast to the whole Samsung group.
Year(s) Of Engagement Activity	2017


Description	Lecture at BMVA Computer Vision Summer School, Swansea, UK
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Participants in your research and patient groups
Results and Impact	The lecture attracted the participating PhD students and practitioners to the topic and resulted in an increase in requests for further involvement in related research/teaching activities.
Year(s) Of Engagement Activity	2014


Description	Live demo at Imperial College Science Festival
Form Of Engagement Activity	Participation in an open day or visit at my research institution
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Public/other audiences
Results and Impact	6D robot vision demo, in the Robotics Forum stand, Imperial College Science Festival May 2015. About 20,000 visitors were received.
Year(s) Of Engagement Activity	2015
