Personalized Exploration of Imagery Database

Lead Research Organisation: University of Bath
Department Name: Computer Science

Abstract

"I want to see jackets which are stylish, but not too fancy. Say, 70% stylish."

This project aims to develop new techniques which can significantly improve the data browsing experience in online shopping, dating, media recommendations, and many other applications.

Two very common ways to explore large collections of imagery items, for instance in online shopping, are to browse a hierarchy of items and to search with textual keywords. The returned results are browsed in lists, typically ordered by popularity. However, popularity is defined across all users as one homogeneous population, and users cannot sort by their own subjective criteria, e.g., by their own personal `style' for clothes: what is `stylish' to one person will be passé to another. Furthermore, there is no way to place items on a continuous scale on which the degree to which each item meets a criterion is known, e.g., how stylish a particular piece of clothing is to a user.

Our goal is to develop new techniques which enable users to organize and explore imagery data based on their own subjective criteria at a high semantic level. This is a challenging problem: Many criteria are hard to quantify and a user may not even be able to articulate the criteria.
We face this challenge by observing that even though users may not be able to specify their criteria quantitatively, or even fully describe them, they are still able to communicate their own notions by providing examples, e.g., "this shoe is cooler than that one". Our goal is to build an algorithm that arranges a large corpus of visual data according to these examples. Once built, the arranged data can be browsed with an interface that exploits the learned criteria to navigate the continuous scale.
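
The idea of learning a continuous scale from examples such as "this shoe is cooler than that one" can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes each item is already described by a numeric feature vector and fits a linear scoring function with a pairwise logistic loss. The function names `fit_pairwise_ranker` and `criterion_scale`, the learning rate, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_pairwise_ranker(features, pairs, lr=0.1, epochs=500, l2=1e-3):
    """Fit weights w so that features[i] @ w > features[j] @ w
    for every (i, j) in pairs, where i was preferred over j."""
    features = np.asarray(features, dtype=float)
    winners = np.array([i for i, _ in pairs])
    losers = np.array([j for _, j in pairs])
    diff = features[winners] - features[losers]      # one row per comparison
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        margin = diff @ w
        # Gradient of mean log(1 + exp(-margin)) plus an L2 penalty on w.
        weight = 1.0 / (1.0 + np.exp(margin))
        grad = -(diff * weight[:, None]).mean(axis=0) + l2 * w
        w -= lr * grad
    return w

def criterion_scale(features, w):
    """Map every item to [0, 1] along the learned criterion."""
    scores = np.asarray(features, dtype=float) @ w
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)

# Toy usage: four items with one (assumed) feature, three user comparisons.
items = np.array([[0.0], [1.0], [2.0], [3.0]])
comparisons = [(3, 2), (2, 1), (1, 0)]               # "item 3 is cooler than item 2", ...
w = fit_pairwise_ranker(items, comparisons)
print(criterion_scale(items, w))
```

Once every item has a position in [0, 1], a request such as "70% stylish" corresponds to retrieving the items whose score lies near 0.7.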

The key contributions of the proposed research will include 1) an exploration of different modes of user interaction, with the resulting knowledge reflected in 2) a new algorithm that, by breaking the limitations of existing approaches, learns effectively and efficiently from user-provided examples and thereby makes personalized data exploration realistic.

Planned Impact

This project aims to develop new techniques which can significantly improve the data browsing experience by enabling users to organize data collections based on their own subjective, semantic-level criteria. If successful, these techniques can be directly used in many applications that use or require data exploration. In particular, online shopping will be the biggest beneficiary of this research. The UK is one of the largest and ever-growing markets in online shopping: As of November 2013, online shopping increased 10% over the year 2012 and revenues reached a monthly record of £10.1 billion. Specific application scenarios include:
1) Finding the perfect chair for a user's room from thousands of possibilities across different styles, by ranking a small subset of chairs by preference.
2) Finding a tasty wine (in terms of personal preference) by trying a small number of different wines: Even novice users could easily establish their shopping portfolio, without having to gain knowledge of domain-specific keywords such as `Tannin' and `Tartaric Acid'.
This research will therefore contribute to the qualitative and quantitative growth of the online shopping market in the UK by attracting users with a significantly improved experience.

Online shopping is only one example of many data browsing applications. Additional application examples are:
- Online dating (£170 million market in the UK): Attractiveness is personal; ranking a small subset of people would help to tailor the matches you receive to your personal scale of attractiveness.
- Media recommendation (e.g., Netflix, iTunes, Kindle): Ranking a few films, albums, or books in an online library to quickly organize the entire collection by your preference.

Our strategy for realizing such a browsing system is to make advances in machine learning, computer vision, and HCI. In particular, one of the key technical contributions of this project will be an improved algorithm for semi-supervised learning. Since semi-supervised learning is nowadays extensively used in diverse areas including data mining, social network analysis, robotics, and genetics, in the long term this project will have an impact on a much broader range of economic and academic activities which may benefit from these techniques.

Furthermore, our techniques will have a societal impact by helping people save time: if successful, users would no longer have to spend hours hunting for just the right item. Users could sort by their particular criteria and have a good chance of finding the right item within a small amount of time. Collectively, this saves people a lot of time, and makes the shopping experience, or more generally the data browsing experience, much more pleasant.
 
Description The goal of this project is to develop new techniques which enable users to organize and explore imagery data based on their own subjective criteria at a high semantic level.

We have developed a set of machine learning algorithms to allow users to communicate their own criteria without having to know how those criteria might be formed or described at the data level [1,3,5,6,7].

One challenge in approaching this problem is the lack of ground-truth labels on which machine learning algorithms can be trained, since users would not want to spend hours providing labels. Our approach is to supplement a small set of labeled data points with a large set of unlabeled data points (i.e., data instances that are not explicitly labeled by users) in the learning process. It has recently been found that the well-established graph Laplacian-based semi-supervised algorithms can overfit to data, i.e., we obtain a machine which explains the training data perfectly but cannot generalize to new data.
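
For readers unfamiliar with the graph Laplacian-based approach mentioned above, the following is a minimal sketch of that classical baseline, not of the new algorithms developed in this project. It assumes a Gaussian-weighted k-nearest-neighbour graph and minimises the squared error on the labeled points plus `lam * f^T L f`, where `L` is the graph Laplacian. The function names, the parameters `k`, `sigma`, and `lam`, and the small ridge term that keeps the linear system solvable are illustrative assumptions.

```python
import numpy as np

def gaussian_knn_graph(X, k=5, sigma=1.0):
    """Symmetric k-nearest-neighbour graph with Gaussian edge weights."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Keep the k strongest neighbours of each node, then symmetrise.
    nearest = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], nearest] = True
    return np.where(mask | mask.T, W, 0.0)

def laplacian_regularized_fit(W, labeled_idx, labeled_values, lam=0.1, ridge=1e-9):
    """Minimise sum over labeled i of (f_i - y_i)^2 + lam * f^T L f."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                 # unnormalised graph Laplacian
    J = np.zeros((n, n))
    J[labeled_idx, labeled_idx] = 1.0              # selects the labeled points
    y = np.zeros(n)
    y[labeled_idx] = labeled_values
    # Setting the gradient to zero gives (J + lam * L) f = J y = y.
    return np.linalg.solve(J + lam * L + ridge * np.eye(n), y)

# Toy usage: two well-separated groups, one user-provided label in each.
X = np.concatenate([np.linspace(0.0, 0.4, 5), np.linspace(10.0, 10.4, 5)])[:, None]
W = gaussian_knn_graph(X, k=2)
f = laplacian_regularized_fit(W, labeled_idx=[0, 5], labeled_values=[0.0, 1.0])
print(np.round(f, 3))
```

With only two labels, the values assigned to the remaining eight points are determined entirely by the graph structure, which is what makes the unlabeled data useful and also why the choice of regulariser matters.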

We developed new semi-supervised learning algorithms that circumvent this problem and enable stable generalization of given labels to the entire unlabeled datasets [3,5,6,7]. Furthermore, we applied these algorithms and the principle of user-guided machine learning to developing:
1) an interactive system that enables users to personalize the character controller for synthesizing animations in video games and animation [8];
2) a new 3D shape data exploration framework that enables users to define their own distance measures, facilitating semantics-level data retrieval [4].

Another key finding of this project is a new algorithmic framework that facilitates the sharing of knowledge gained from individual tasks across different data organization problems and users [2]: Many real-world data organization problems involve learning several tasks exhibiting mutual dependence, e.g., the task of retrieving `smart' cars in a database is correlated with searching `sporty' cars. However, existing attempts to `combine' such multiple tasks are limited in that they require known mathematical forms of individual task executors and therefore, they are not directly applicable to aggregating decisions from pre-compiled software libraries or human evaluations (with unknown mathematical forms). We developed a new form-independent framework that automatically discovers latent dependence across multiple heterogeneous tasks and benefits from combining them without requiring access to their mathematical forms [2]. This significantly broadens the application spectrum of multiple task combination approaches.

Publications
[1] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt,
Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking,
Proc. BMVC, 2017.

[2] K. I. Kim, J. Tompkin, and C. Richardt,
Predictor Combination at Test Time,
Proc. ICCV, 2017.

[3] K. I. Kim,
Semi-supervised Learning based on Joint Diffusion of Graph Functions and Laplacians,
Proc. ECCV, 2016.

[4] K. Dev, K. I. Kim, N. Villar, and M. Lau,
Improving style similarity metrics of 3D shapes,
Proc. Graphics Interface, 2016.

[5] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt,
Semi-supervised learning with explicit relationship regularization,
Proc. CVPR, 2015.

[6] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt,
Local high-order regularization on data manifolds,
Proc. CVPR, 2015.

[7] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt,
Context-guided diffusion for label propagation on graphs,
Proc. ICCV, 2015.

[8] H. Rhodin, J. Tompkin, K. I. Kim, E. de Aguiar, H.-P. Seidel, and C. Theobalt,
Generalizing wave gestures from sparse examples for real-time character control,
ACM Trans. Graphics (Proc. SIGGRAPH Asia), 2015.

Exploitation Route Our new semi-supervised learning algorithms can be directly applied to various regression, classification, and ranking problems. The research outcomes have been disseminated through high-quality scientific conferences, journals, and open source software packages:
-------
https://people.mpi-inf.mpg.de/~kkim/hreg/index.html
https://people.mpi-inf.mpg.de/~kkim/relreg/index.html
https://people.mpi-inf.mpg.de/~kkim/diff/index.html
-------
Our interactive character control system (as an application of our algorithmic framework) can be used by computer animators, e.g., for video games and animation.
Sectors Creative Economy, Digital/Communication/Information Technologies (including Software), Healthcare

 
Description The outcomes of this project have been reported to the scientific community at various international conferences including the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), the International Conference on Computer Vision (ICCV), the European Conference on Computer Vision (ECCV), the British Machine Vision Conference (BMVC), and SIGGRAPH Asia. The software packages and datasets associated with this project have been made publicly available at
-------
[1] http://mloss.org/software/view/644/
[2] https://people.mpi-inf.mpg.de/~kkim/hreg/index.html
[3] https://people.mpi-inf.mpg.de/~kkim/relreg/index.html
[4] https://people.mpi-inf.mpg.de/~kkim/diff/index.html
[5] https://github.com/saquil/RankCGAN
-------
After successfully delivering the scientific outcomes, we are continuing to work on this project with the aim of producing bigger economic and societal impacts. Our techniques can be used in many applications that use or require imagery data exploration and editing. Potential beneficiaries include creative industries, e.g., graphics, games, and animation, and (end-users of) online shopping and media recommendation. Our most recent impact activity involves using the database ranking algorithm (developed in the current project) for designing data exploration and synthesis tools in the Human-Computer Interaction graduate course at UNIST, Korea (the PI's current affiliation).
First Year Of Impact 2018
Sector Education
Impact Types Cultural, Economic

 
Title Context-guided diffusion algorithm on graphs 
Description Existing approaches for diffusion on graphs, e.g., for label propagation, are mainly focused on isotropic diffusion, which is induced by the commonly-used graph Laplacian regularizer. This algorithm facilitates anisotropic diffusion on graphs and the corresponding label propagation. The algorithm is obtained by constructing positive definite diffusivity operators on the vector bundles of Riemannian manifolds, and discretising them to diffusivity operators on graphs. The algorithm can be used in semi-supervised learning, spectral clustering, and spectral embedding of graph structured data. 
Type Of Material Computer model/algorithm 
Year Produced 2016 
Provided To Others? Yes  
Impact Our new data analysis framework has been presented at an academic conference: International Conference on Computer Vision 2015. 
URL https://people.mpi-inf.mpg.de/~kkim/diff/index.html
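
The isotropic diffusion that the entry above contrasts against can be sketched in a few lines. This is only the classical baseline induced by the graph Laplacian, written as explicit Euler steps of the graph heat equation; it does not implement the anisotropic, context-guided diffusion itself. The function name `isotropic_diffusion`, the default step size, and the toy graph are assumptions made for illustration.

```python
import numpy as np

def isotropic_diffusion(W, f0, steps=50, tau=None):
    """Run explicit Euler steps f <- f - tau * L f on a weighted graph.

    W  : symmetric non-negative weight matrix
    f0 : initial function on the nodes (e.g. known labels, zeros elsewhere)
    """
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    L = np.diag(degree) - W                 # unnormalised graph Laplacian
    if tau is None:
        tau = 0.5 / degree.max()            # small enough for a stable update
    f = np.asarray(f0, dtype=float).copy()
    for _ in range(steps):
        f = f - tau * (L @ f)               # every edge is smoothed equally
    return f

# Toy usage: a path graph 0 - 1 - 2 with all the mass initially on node 0.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
print(isotropic_diffusion(W, [1.0, 0.0, 0.0], steps=10))
```

Because the same Laplacian is applied in every direction, the values spread uniformly along all edges; an anisotropic scheme would instead let the amount of smoothing depend on the local context.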
 
Title Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking 
Description Large databases are often organized by hand-labeled metadata (or criteria), which are expensive to collect. We can use unsupervised learning to model database variation, but these models are often high dimensional, complex to parameterize, or require expert knowledge. We learn low-dimensional continuous criteria via interactive ranking, so that the novice user need only describe the relative ordering of examples. This is formulated as semi-supervised label propagation in which we maximize the information gained from a limited number of examples. Further, we actively suggest data points to the user to rank in a more informative way than existing work. Our efficient approach allows users to interactively organize thousands of data points along 1D and 2D continuous sliders. We experiment with databases of imagery and geometry to demonstrate that our tool is useful for quickly assessing and organizing the content of large databases. 
Type Of Material Computer model/algorithm 
Year Produced 2017 
Provided To Others? Yes  
Impact Our new graph data analysis framework has been presented at an academic conference: British Machine Vision Conference (BMVC) 2017 
URL https://jamestompkin.com/assets/projects/criteriasliders/
 
Title Joint diffusion of graph functions and Laplacians 
Description This algorithm is an instantiation of a discrete regularizer on a graph's diffusivity operator. The algorithm is grounded in the theory that regularizing the diffusivity operator corresponds to regularizing the metric on Riemannian manifolds, which further corresponds to regularizing the anisotropic Laplace-Beltrami operator. This new algorithm significantly improves existing semi-supervised learning and ranking algorithms on graphs. 
Type Of Material Computer model/algorithm 
Year Produced 2016 
Provided To Others? Yes  
Impact Our new graph data analysis framework has been presented at an academic conference: European Conference on Computer Vision (ECCV) 2016. 
URL https://people.mpi-inf.mpg.de/~kkim/ldiff/index.html
 
Title Ranking CGANs: Subjective Control over Semantic Image Attributes 
Description Given pairwise comparisons of images, our model, called RankCGAN, learns 1) to rank images using a subjective measure; and 2) a generative model that can be controlled by that measure. RankCGAN enables users to synthesize images based on subjective measures of interest. 
Type Of Material Computer model/algorithm 
Year Produced 2018 
Provided To Others? Yes  
Impact Our new graph data analysis framework has been presented at an academic conference: British Machine Vision Conference (BMVC) 2018. 
URL https://github.com/saquil/RankCGAN
 
Description User-centric image organization: collaboration with Brown University and Max Planck Institute for Informatics (Note: grant transferred from EP/M00533X/1) 
Organisation Brown University
Country United States 
Sector Academic/University 
PI Contribution In joint research on user-centric image organization, I contributed by developing new machine learning algorithms tailored for data organization: 1) Our new high-order regularization algorithms do not suffer from the degeneracy of conventional graph Laplacian-type regularization algorithms (see [1,6] Outputs section) and they facilitate the propagation of user-provided data annotations to the entire image database; 2) Our new predictor combination approach enables us to combine multiple heterogeneous predictors made by users benefiting from the sharing of knowledge gained from individual tasks across different data organization problems and users (see [5] Outputs section).
Collaborator Contribution Prof. Tompkin at Brown University is an expert in computational photography and videography, and user-centric content generation. He contributed by designing interactive systems that enable users to communicate their own data exploration criteria (supported by our new machine learning algorithms). Prof. Christian Theobalt at the Max Planck Institute for Informatics, who is an expert in human motion analysis and 3D image analysis and synthesis, contributed his significant technical and scientific knowledge to developing a framework that enables users to design their own gesture-based, animated character control interfaces [4].
Impact This collaboration has led to several important published (plus unpublished so far) outcomes: [1] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [2] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [3] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. International Conference on Computer Vision, 2015. [4] H. Rhodin, J. Tompkin, K. I. Kim, E. de Aguiar, H. Pfister, H.-P. Seidel, and C. Theobalt, Generalizing Wave Gestures from Sparse Examples for Real-time Character Control, ACM Trans. Graphics (Proc. SIGGRAPH), 2015. [5] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. International Conference on Computer Vision, 2017. [6] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. British Machine Vision Conference, 2017.
Start Year 2015
 
Description User-centric image organization: collaboration with Brown University and Max Planck Institute for Informatics (Note: grant transferred from EP/M00533X/1) 
Organisation Max Planck Society
Department Max Planck Institute for Informatics
Country Germany 
Sector Charity/Non Profit 
PI Contribution In joint research on user-centric image organization, I contributed by developing new machine learning algorithms tailored for data organization: 1) Our new high-order regularization algorithms do not suffer from the degeneracy of conventional graph Laplacian-type regularization algorithms (see [1,6] Outputs section) and they facilitate the propagation of user-provided data annotations to the entire image database; 2) Our new predictor combination approach enables us to combine multiple heterogeneous predictors made by users benefiting from the sharing of knowledge gained from individual tasks across different data organization problems and users (see [5] Outputs section).
Collaborator Contribution Prof. Tompkin at Brown University is an expert in computational photography and videography, and user-centric content generation. He contributed by designing interactive systems that enable users to communicate their own data exploration criteria (supported by our new machine learning algorithms). Prof. Christian Theobalt at the Max Planck Institute for Informatics, who is an expert in human motion analysis and 3D image analysis and synthesis, contributed his significant technical and scientific knowledge to developing a framework that enables users to design their own gesture-based, animated character control interfaces [4].
Impact This collaboration has led to several important published (plus unpublished so far) outcomes: [1] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [2] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [3] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. International Conference on Computer Vision, 2015. [4] H. Rhodin, J. Tompkin, K. I. Kim, E. de Aguiar, H. Pfister, H.-P. Seidel, and C. Theobalt, Generalizing Wave Gestures from Sparse Examples for Real-time Character Control, ACM Trans. Graphics (Proc. SIGGRAPH), 2015. [5] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. International Conference on Computer Vision, 2017. [6] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. British Machine Vision Conference, 2017.
Start Year 2015
 
Title Local high-order regularization on data manifolds 
Description Our software implements a new regularizer which is globally high order and is also sparse for efficient computation in semi-supervised learning applications. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact The software has been released under GPL on the machine learning open source software forum (mloss.org). 
URL http://mloss.org/software/search/?searchterm=Local+high+order+regularization&post=
 
Description Presentation at Dongseo University 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact I gave a talk on our semantics-level image exploration work at Dongseo University, Korea as a part of a recently started research collaboration with the Dept. of Digital Contents.
Year(s) Of Engagement Activity 2016
 
Description Presentation at Imperial College London 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact I was invited to Imperial College London to present our work on a user-centric imagery framework. This led to a research collaboration with Dr. Tae-Kyun Kim at the Imperial Computer Vision & Learning Lab.
Year(s) Of Engagement Activity 2016
 
Description Presentation at Kyungpook National University 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact I gave a talk on our semantics-level image exploration work at Kyungpook National University, Korea.
Year(s) Of Engagement Activity 2016
 
Description Research presentation at Brno University of Technology 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Undergraduate students
Results and Impact I gave a research presentation on our personalized image manipulation and semantics-level image exploration work at Brno University of Technology, Czech Republic.
Year(s) Of Engagement Activity 2017
URL http://vgs-it.fit.vutbr.cz/2017/05/03/kwang-in-kim-toward-intuitive-imagery-user-friendly-manipulati...
 
Description Research presentation at KAIST, Korea 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact I gave a research presentation on our personalized image exploration project at KAIST, Korea.
Year(s) Of Engagement Activity 2019
 
Description Research presentation at UNIST, Korea 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact I gave a research presentation on our personalized image exploration project at UNIST, Korea.
Year(s) Of Engagement Activity 2019