3D Intrinsic Shape Recognition Under Deformation and View Changes
Lead Research Organisation:
Imperial College London
Department Name: Electrical and Electronic Engineering
Abstract
In this proposal, we tackle the novel visual recognition problem of identifying the 3D (three-dimensional) shape identities or categories of deformable objects. Images of 3D objects undergo large appearance changes due to different object poses (articulation or deformation) as well as camera view-points. We attempt to recognise objects from single images by their 3D shape identities (or intrinsic shapes), regardless of their present poses and camera view-points. Humans can perceive the 3D shapes of objects from single images, provided that they have previously seen the 3D shapes of other similar objects. This previously learnt knowledge of 3D shapes is called a 3D shape prior. A key idea for fulfilling the proposed task is to learn and exploit shape priors for object recognition.
The proposed research is well aligned with, and goes beyond, important topics in computer vision. Whereas much work on view-point invariant object recognition is limited to rigid object classes with rich textures, we consider deformable object shapes. A series of works in the field of single view reconstruction has shown promising results for human body shape reconstruction under pose variations. There has also been notable recent success in 3D human pose recognition. Building on these results, we go further to capture 3D intrinsic shape variations for object recognition. The intended outcomes would benefit the relevant academic fields and their existing markets, and could also lead to new applications such as automatic monitoring of public obesity and animal tracking.
Planned Impact
As the relevant academic topics have long been explored and have matured enough to yield commercial application markets, our research potentially benefits those existing markets, e.g. content-based image retrieval, photo album management and surveillance systems, by recognising intrinsic object shapes. Human shape modelling in the proposed work is itself a useful output for man-machine interfaces such as avatars, when it explicitly yields shape reconstruction. It also helps human pose recognition, which was recently deployed in a commercial console game with huge success, by making it invariant to human shape changes.
The proposed research could spin out new applications, especially of interest for public health care. Obesity, a major public health concern, has doubled since 1980 and has become a top social issue in the UK and the world. Systematic ways of monitoring and managing individual and public levels of overweight are in high demand. An interesting new sub-topic of computer vision, food recognition [49,57], is relevant to this issue. Owing to the widespread use of smart phones equipped with a camera, users can take photos of the food they eat each day, and the software automatically recognises food types and provides users with useful dietary information according to a pre-defined food calorie table. Various mobile health projects (called m- or e-health) have also been carried out for health services and information. The proposed system can automatically categorise human shapes in real-time and without users' attention or notice (no particular pose or camera view-point is needed). The estimated shapes can lead to more direct measurements of obesity, i.e. waist circumference and weight. We intend to carry on our research in the context of vision-based health care, through human shape recognition, food recognition and, in future, their combination.
Our solution aims at recognising individual human shapes or shape categories. This further strengthens existing biometric solutions (face, gait recognition) in an appropriate environment. An X-ray type scanner that produces "naked" images of passengers has been adopted to speed up security checks at Manchester Airport. With the aid of such a sensor, clothing effects are removed when deploying the proposed silhouette-based algorithm. The proposed algorithm can be combined with existing face or gait recognisers to improve accuracy. The real-time solution does not need to save images for further use and quickly destroys them to lessen privacy issues.
Our proposed research addresses generic object categories and is not limited to human bodies. Animal tracking and monitoring, especially for horses or big fish, has long been studied. These applications need a non-invasive way of monitoring without human observers, and present satellite navigation tracking systems are costly and require specific sensors. The proposed vision-based solution would help improve existing solutions by accounting for large shape and pose variations, and would help detect animal sub-species by their shape categories in a reasonably constrained environment.
Our research and its successful outcomes would positively influence security and health related policy-makers and government agencies. Given the adverse impact of government funding cuts on UK national health services, technological development in mobile health care can help achieve the productivity and cost-efficiency savings necessary in both the public and private healthcare sectors.
Through the intended project activities, the P.I. would learn skills in project management and professional networking, and the R.A. would learn the skills needed to carry out the main research activities, including efficient coding, management of different program versions and code sharing. Both the P.I. and the R.A. would be involved in, and learn skills for, interacting with a wide spectrum of audiences and giving a clear description of the project goals and achievements for maximum dissemination of the results.
People |
ORCID iD |
Tae-Kyun Kim (Principal Investigator) |
Publications
Luo W
(2019)
Trajectories as Topics: Multi-Object Tracking by Topic Discovery.
in IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
Tang D
(2019)
Opening the Black Box: Hierarchical Sampling Optimization for Hand Pose Estimation.
in IEEE transactions on pattern analysis and machine intelligence
Tang D
(2017)
Latent Regression Forest: Structured Estimation of 3D Hand Poses
in IEEE Transactions on Pattern Analysis and Machine Intelligence
Garcia-Hernando G.
(2017)
Transition Forests: Learning Discriminative Temporal Transitions for Action Recognition
Doumanoglou A
(2016)
Folding Clothes Autonomously: A Complete Pipeline
in IEEE Transactions on Robotics
Xiong C
(2016)
Convolutional Fusion Network for Face Verification in the Wild
in IEEE Transactions on Circuits and Systems for Video Technology
Description | We have developed a set of theories and methods that fit a 3D deformable model (human body, hand, face, etc.) to input images. |
Exploitation Route | They are useful for understanding visual content in 3D, including view-points, articulation and deformation, as well as object classes. |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | The developed system and program have been used for (1) a Huawei industrial grant on 3D hand pose estimation, and (2) a Samsung industrial grant on reinforcement learning for robot grasping. In these applications, the key findings are applied to address articulated and highly deformable object recognition problems. |
Sector | Digital/Communication/Information Technologies (including Software) |
Impact Types | Economic |
Description | EPSRC program grant (Face Matching for Automatic Identity Retrieval, Recognition, Verification and Management) |
Amount | £5,958,623 (GBP) |
Funding ID | EP/N007743/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2016 |
End | 12/2020 |
Description | Industrial grant (human pose/activity detection) |
Amount | £82,000 (GBP) |
Organisation | Omron Corporation |
Sector | Private |
Country | Japan |
Start | 09/2015 |
End | 09/2016 |
Title | Unified Face Analysis Dataset |
Description | A new benchmark to evaluate the joint performance on sub-problems of face |
Type Of Material | Database/Collection of data |
Provided To Others? | No |
Impact | Will be helpful for other researchers in the field |
Description | A joint paper with Prof. Cipolla's group in Univ. of Cambridge |
Organisation | University of Cambridge |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | a new benchmark on 3D human body pose and action detection, and a new framework to conduct pose and action estimation simultaneously |
Collaborator Contribution | detailed formulations, implementations, experiments, writing-up |
Impact | T.H. Yu, T-K. Kim, and R. Cipolla, Unconstrained Monocular 3D Human Pose Estimation by Action Detection and Cross-modality Regression Forest, Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Portland, Oregon, USA, 2013. |
Start Year | 2012 |
Description | Joint papers with Center for Research and Technology Hellas (CERTH), Thessaloniki, Greece |
Organisation | Centre for Research and Technology Hellas (CERTH) |
Country | Greece |
Sector | Academic/University |
PI Contribution | novel ideas to perform active visual perception, comparative experiments |
Collaborator Contribution | detailed formulations of ideas, implementations, data work, experiments, writing-up |
Impact | A. Doumanoglou, T-K. Kim, X. Zhao, S. Malassiotis, Active Random Forests: An application to Autonomous Unfolding of Clothes, Proc. of European Conference on Computer Vision (ECCV), Zurich, Switzerland, 2014. A. Doumanoglou, A. Kargakos, T-K. Kim, S. Malassiotis, Autonomous Active Recognition and Unfolding of Clothes using Random Decision Forests and Probabilistic Planning, Proc. of IEEE Int. Conf. on Robotics and Automation (ICRA), Hong Kong, China, 2014 (KUKA best service robotics paper award). |
Start Year | 2012 |
Description | Associate Editor of (Elsevier) Image and Vision Computing Journal |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I joined the editorial board of a premium international journal in the field of computer vision. |
Year(s) Of Engagement Activity | 2016,2017 |
Description | Associate Editor of IPSJ Transactions on Computer Vision and Applications (CVA), 2016-2018 |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I joined the editorial board of an international journal in the field of computer vision. |
Year(s) Of Engagement Activity | 2016,2017 |
Description | BMVA Executive Committee member |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | I joined the BMVA Executive Committee for the period of Jan 2016-Dec 2018. The BMVA organises and supports various academic events and activities in the field of computer vision, and the BMVA Executive Committee meets regularly throughout each year. |
Year(s) Of Engagement Activity | 2016,2017 |
Description | Demo, Imperial College Science Festival May 2016 (500+ visitors) |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Schools |
Results and Impact | We ran a hands-on demo of (deformable and articulated object) pose estimation at the Imperial College Science Festival, May 2016. Our demo attracted 500+ visitors. |
Year(s) Of Engagement Activity | 2016 |
Description | General chair of BMVC17 in London, UK |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | We organised a top-tier computer vision conference at Imperial College. It received a record number of paper submissions (650) and attendees (500) in the history of BMVC. |
Year(s) Of Engagement Activity | 2017 |
Description | General co-chair of British Machine Vision Conference (BMVC), London, Sep 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I co-organise this premier international conference on computer vision in London. 350+ participants and 600+ high-quality paper submissions are expected. The most renowned academic figures in the field are confirmed as keynote and tutorial speakers for the event. This year's event is expected to be a unique milestone in various respects. |
Year(s) Of Engagement Activity | 2017 |
Description | General co-chair of IEEE 2nd workshop on observing and understanding hands in action (HANDS, in conjunction with CVPR), Las Vegas, USA, Jun 2016 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I co-organised the international workshop on the topic of hand pose estimation in conjunction with CVPR 2016. About 55+ PG students, academics and professionals attended this workshop, at which 4 oral papers and 10 posters were presented. We also had 5 invited talks from renowned experts on the topic. |
Year(s) Of Engagement Activity | 2016 |
Description | General co-chair of IEEE 3rd workshop on observing and understanding hands in action (HANDS) and the 2017 Hands in the Million Challenge, (in conjunction with ICCV), Venice, Italy |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The workshop was very well attended, effectively promoting the research topic to the audience. The challenge organised in conjunction with the workshop was very successful, receiving about 20 participating entries, and the effort was in turn published as a research article at CVPR 2018. |
Year(s) Of Engagement Activity | 2017 |
Description | Guest Editor of (Elsevier) Pattern Recognition Letters Special Issue on Personalised and Context-sensitive Interfaces in the Wild, 2016. |
Form Of Engagement Activity | A magazine, newsletter or online publication |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I have served as a guest editor of a special issue of the Pattern Recognition Letters journal. |
Year(s) Of Engagement Activity | 2016,2017 |
Description | Invited lecture/lab, BMVA computer vision summer school, Swansea, UK |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | About 60+ PG students attended this school, and I gave a 1.5-hour lecture and a 1.5-hour hands-on session on random decision forests with deep learning. The participants expressed a lot of interest in the topics and said they would use what they had learnt in their PG studies. The school reported very good feedback from the students. |
Year(s) Of Engagement Activity | 2016 |
Description | Invited talk at Deep learning summit, London, UK |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | About 300+ engineers, entrepreneurs and postgraduate students attended the summit. I gave an invited talk with a live demo on the topic of deep learning and random forests, with applications to hand pose estimation and face recognition. The summit organisers reported very good feedback from the audience on the event and topics. |
Year(s) Of Engagement Activity | 2016 |
Description | Invited talk at Korean Conf. on Computer Vision |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | About 200+ PG students and engineers from industry attended this event, where I gave an invited talk on the topic of deep learning and tree-structured algorithms for hand (articulated object) pose and face (deformable object) recognition. The talk prompted many questions and discussions during the event. |
Year(s) Of Engagement Activity | 2016 |
Description | Invited talk at Omron Corporation, Kyoto, Japan, 2013 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The talk covered our own contributions to challenging novel problems that form part of the project: real-time action recognition, 3D posture recognition and object phenotype recognition. Randomised Decision Forests and tree-structured methods were proposed for real-time vision solutions. The talk led to an increase in requests for further involvement in the related research activities. |
Year(s) Of Engagement Activity | 2013 |
Description | Invited talk in Seminar Series of the Robotics Research Group at University of Oxford, UK |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Participants in your research and patient groups |
Results and Impact | The talk covered Randomised Decision Forests, an emerging technique in the relevant fields that has been highly successful for various real-time vision solutions. Randomised Decision Forests and tree-structured methods were reviewed with comparative and insightful discussions, leading to our own contributions to challenging novel problems that we recently tackled at Imperial College London: real-time action recognition, 3D posture recognition and object phenotype recognition. The talk led to an increase in requests for further involvement in the related research activities. |
Year(s) Of Engagement Activity | 2012 |
Description | Keynote at Korea-Japan joint workshop on Frontiers of Computer Vision, Takayama, Japan |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | About 150+ PG students and professors attended this conference, where I gave a keynote. This is a unique event for computer vision researchers, especially for promoting collaboration and knowledge-sharing between Korea and Japan. My talk on the topic of deep learning and tree-structured algorithms sparked many questions and much interest, and the conference organisers reported increased attendance and excellent feedback on my talk. |
Year(s) Of Engagement Activity | 2016 |
Description | Keynote at Samsung AI Forum, Samsung Software R&D Center, Seoul, Korea |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | I gave an invited talk at the AI Forum organised by the Samsung group. More than 300 people attended the event, and the talk was broadcast to the whole Samsung group. |
Year(s) Of Engagement Activity | 2017 |
Description | Lecture at BMVA Computer Vision Summer School, Swansea, UK |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Participants in your research and patient groups |
Results and Impact | The lecture attracted the participating PhD students and practitioners to the topic, and led to an increase in requests for further involvement in related research and teaching activities. |
Year(s) Of Engagement Activity | 2014 |
Description | Live demo at Imperial College Science Festival |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | 6D robot vision demo, in the Robotics Forum stand, Imperial College Science Festival May 2015. About 20,000 visitors were received. |
Year(s) Of Engagement Activity | 2015 |