Personalized Exploration of Imagery Database

Lead Research Organisation: University of Bath

Department Name: Computer Science

Abstract

"I want to see jackets which are stylish, but not too fancy. Say, 70% stylish."

This project aims to develop new techniques which can significantly improve data browsing experience in online shopping, dating, media recommendations, and many other applications.

Two very common ways to explore large collections of imagery items, for instance in online shopping, are to browse a hierarchy of items and to search with textual keywords. The returned results are browsed in lists, typically ordered by popularity. However, popularity is defined across all users as one homogeneous population, and users cannot sort by their own subjective criteria, e.g., by their own personal `style' for clothes: what is `stylish' to one person will be passé to another. Furthermore, there is no way to place items on a continuous scale on which the degree to which each item meets a criterion is known, e.g., how stylish a particular piece of clothing is to a user.

Our goal is to develop new techniques which enable users to organize and explore imagery data based on their own subjective criteria at a high semantic level. This is a challenging problem: Many criteria are hard to quantify and a user may not even be able to articulate the criteria.
We face this challenge by observing that even though users may not be able to specify their criteria quantitatively, or even fully describe them, they are still able to communicate their own notions by providing examples, e.g., "this shoe is cooler than that one". Our goal is to build an algorithm that arranges a large corpus of visual data according to these examples. Once built, the arranged data can be browsed with an interface that exploits the learned criteria to navigate the continuous scale.
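
The idea of arranging items from examples such as "this shoe is cooler than that one" can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes each item is already described by a numeric feature vector, fits a plain linear scoring function with a logistic pairwise loss, and rescales the scores to a 0-to-1 slider. The function names, toy features, and comparisons are all hypothetical.

```python
import numpy as np

def fit_pairwise_ranker(features, pairs, lr=0.1, epochs=500, l2=1e-3):
    """Fit a linear score s(x) = w . x from pairwise preferences.

    `pairs` holds (winner_index, loser_index) tuples: the user judged
    the first item to rank above the second.
    """
    X = np.asarray(features, dtype=float)
    win = np.array([p[0] for p in pairs])
    lose = np.array([p[1] for p in pairs])
    diffs = X[win] - X[lose]              # one row per comparison
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        margins = diffs @ w
        # Gradient of mean log(1 + exp(-margin)) plus an L2 penalty.
        weights = 1.0 / (1.0 + np.exp(margins))
        grad = -(diffs * weights[:, None]).mean(axis=0) + l2 * w
        w -= lr * grad
    return w

def to_slider_positions(scores):
    """Rescale raw scores to [0, 1] so an item can be addressed as, e.g., 0.7."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

# Toy data (assumed): 6 items, 2 hypothetical visual features each.
items = np.array([[0.0, 1.0], [0.2, 0.8], [0.4, 0.9],
                  [0.6, 0.1], [0.8, 0.3], [1.0, 0.0]])
# Hypothetical user feedback: "item 5 is cooler than item 0", and so on.
comparisons = [(5, 0), (4, 1), (3, 2), (5, 3)]

w = fit_pairwise_ranker(items, comparisons)
positions = to_slider_positions(items @ w)
```

A request such as "70% stylish" would then correspond to looking up the items whose slider position is closest to 0.7.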

The key contributions of the proposed research will include 1) exploring different modes of user interaction and 2) translating the resulting knowledge into a new algorithm that, by overcoming the limitations of existing approaches, effectively and efficiently learns from user-provided examples and thereby makes personalized data exploration realistic.

Planned Impact

This project aims to develop new techniques which can significantly improve data browsing experience by enabling users to organize data collections based on their own subjective, semantic-level criteria. If successful, these techniques can be directly used in many applications that use/require data exploration. In particular, online shopping will be the biggest beneficiary of this research. The UK is one of the largest and ever growing markets in online shopping: As of November 2013, online shopping increased 10% over the year 2012 and revenues reached a monthly record of £10.1 billion. Specific application scenarios include
1) Finding the perfect chair for a user's room from thousands of possibilities across different styles, by ranking a small subset of chairs by preference.
2) Finding a tasty wine (in terms of personal preference) by trying a small number of different wines: Even novice users could easily establish their shopping portfolio, without having to gain knowledge of domain-specific keywords such as `Tannin' and `Tartaric Acid'.
This research will therefore contribute to qualitative and quantitative growth of the online shopping market in the UK by attracting users with a significantly improved experience.

Online shopping is only an example of many data browsing applications. Additional application examples are
- Online dating (£170 million market in the UK): Attractiveness is personal; ranking a small subset of people would help to tailor the matches you receive according to a personal attractiveness scale.
- Media recommendation (e.g., Netflix, iTunes, Kindle): Ranking a few films, albums, or books in an online library to quickly organize the entire collection by your preference.

Our strategy for realizing such a browsing system is to make advances in machine learning, computer vision, and HCI. In particular, one of the key technical contributions of this project will be an improved algorithm for semi-supervised learning. Since semi-supervised learning is nowadays extensively used in diverse areas including data mining, social network analysis, robotics, and genetics, in the long term this project will have an impact on a much broader range of economic and academic activities which may benefit from these techniques.

Furthermore, our techniques will have a societal impact by helping people save time: if successful, users would no longer have to spend hours hunting for just the right item. Users could sort by their particular criteria and have a good chance of finding the item within a short time. Collectively, this saves people a lot of time and makes the shopping experience, or more generally the data browsing experience, much more pleasant.

Funded Value:

£29,701

Funded Period:

Aug 16 - May 17

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/M00533X/2

Principal Investigator:

Kwang In Kim

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Artificial Intelligence (25%)

Computer Graphics & Visual. (25%)

Human-Computer Interactions (25%)

Image & Vision Computing (25%)

Organisations

People	ORCID iD
Kwang In Kim (Principal Investigator)

Publications


Baek S (2018) Augmented Skeleton Space Transfer for Depth-Based Hand Pose Estimation

Dev K. (2016) Improving style similarity metrics of 3D shapes

Kim K (2016) Computer Vision - ECCV 2016 - 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part V

Kim K (2017) Predictor Combination at Test Time

Mejjati Y (2018) Unsupervised Attention-guided Image to Image Translation

Mejjati Y (2018) Multi-task Learning by Maximizing Statistical Dependence

Saquil Y (2018) Ranking CGANs: Subjective Control over Semantic Image Attributes

Saquil Y. (2019) Ranking cGANs: Subjective control over semantic image attributes in British Machine Vision Conference 2018, BMVC 2018

Tompkin J (2017) Criteria sliders: learning continuous database criteria via interactive ranking

Tompkin J (2017) Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking

Related Projects

Project Reference	Relationship	Related To	Start	End	Award Value
EP/M00533X/1			01/03/2015	30/05/2016	£97,878
EP/M00533X/2	Transfer	EP/M00533X/1	31/08/2016	30/05/2017	£29,702

Key Findings
Impact Summary
Research Databases and Models
Collaboration
Software and Technical Products
Engagement Activities


Description	The goal of this project is to develop new techniques which enable users to organize and explore imagery data based on their own subjective criteria at a high semantic level. We have developed a set of machine learning algorithms that allow users to communicate their own criteria without having to know how those criteria might be formed or described at the data level [1,3,5,6,7]. One challenge in approaching this problem is the lack of ground-truth labels on which machine learning algorithms can be trained, since users would not want to spend hours providing labels. Our approach is to support learning from a set of labeled data points with a large set of unlabeled data points (i.e., data instances that are not explicitly labeled by users). It has recently been found that the well-established graph Laplacian-based semi-supervised algorithms can overfit to data, i.e., we obtain a machine which explains the training data perfectly but cannot generalize to new data. We developed new semi-supervised learning algorithms that circumvent this problem and enable stable generalization of given labels to the entire unlabeled dataset [3,5,6,7]. Furthermore, we applied these algorithms and the principle of user-guided machine learning to developing 1) an interactive system that enables users to personalize the character controller for synthesizing animations in video games and animations [8], and 2) a new 3D shape data exploration framework that enables users to define their own distance measures, facilitating semantics-level data retrieval [4].
Another key finding of this project is a new algorithmic framework that facilitates the sharing of knowledge gained from individual tasks across different data organization problems and users [2]. Many real-world data organization problems involve learning several tasks exhibiting mutual dependence, e.g., the task of retrieving `smart' cars in a database is correlated with searching for `sporty' cars. However, existing attempts to `combine' such multiple tasks are limited in that they require known mathematical forms of the individual task executors, and they are therefore not directly applicable to aggregating decisions from pre-compiled software libraries or human evaluations (with unknown mathematical forms). We developed a new form-independent framework that automatically discovers latent dependence across multiple heterogeneous tasks and benefits from combining them without requiring access to their mathematical forms [2]. This significantly broadens the application spectrum of multiple task combination approaches.
Publications
[1] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. BMVC, 2017.
[2] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. ICCV, 2017.
[3] K. I. Kim, Semi-supervised Learning based on Joint Diffusion of Graph Functions and Laplacians, Proc. ECCV, 2016.
[4] K. Dev, K. I. Kim, N. Villar, and M. Lau, Improving style similarity metrics of 3D shapes, Proc. Graphics Interface, 2016.
[5] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. CVPR, 2015.
[6] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. CVPR, 2015.
[7] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. ICCV, 2015.
[8] H. Rhodin, J. Tompkin, K. I. Kim, E. de Aguiar, H.-P. Seidel, and C. Theobalt, Generalizing wave gestures from sparse examples for real-time character control, ACM Trans. Graphics (Proc. SIGGRAPH Asia), 2015.
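
For readers unfamiliar with the conventional graph Laplacian-based semi-supervised learning that the description takes as its starting point, the following is a minimal sketch of that baseline only; it does not reproduce the new algorithms of [3,5,6,7]. It assumes a dense Gaussian affinity graph and a squared loss on the labeled points, and every function name, parameter, and data value is illustrative.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian affinity matrix with a zero diagonal."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def propagate_labels(W, labeled_idx, labeled_values, reg=1e-2):
    """Minimise sum over labeled i of (f_i - y_i)^2 + reg * f^T L f."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W        # unnormalised graph Laplacian
    S = np.zeros((n, n))
    S[labeled_idx, labeled_idx] = 1.0     # diagonal selector of labeled points
    y = np.zeros(n)
    y[labeled_idx] = labeled_values
    # Setting the gradient to zero gives the linear system (S + reg * L) f = y.
    return np.linalg.solve(S + reg * L, y)

# Toy data (assumed): two well-separated groups, one labeled point in each.
X = np.array([[0.0], [0.3], [0.6], [5.0], [5.3], [5.6]])
W = gaussian_affinity(X, sigma=1.0)
f = propagate_labels(W, labeled_idx=[0, 5], labeled_values=[0.0, 1.0])
```

In this toy case the two labels spread to the unlabeled points of their own group, which is the behaviour the project's algorithms aim to make more stable on real data.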
Exploitation Route	Our new semi-supervised learning algorithms can be directly applied to various regression, classification, and ranking problems. The research outcomes have been disseminated through high-quality scientific conferences, journals, and open source software packages: ------- https://people.mpi-inf.mpg.de/~kkim/hreg/index.html https://people.mpi-inf.mpg.de/~kkim/relreg/index.html https://people.mpi-inf.mpg.de/~kkim/diff/index.html ------- Our interactive character control system (as an application of our algorithmic framework) can be used by computer animators, e.g., for video games and animation.
Sectors	Creative Economy; Digital/Communication/Information Technologies (including Software); Healthcare


Description	The outcomes of this project have been reported to the scientific community at various international conferences including IEEE Conference on Computer Vision and Pattern Recognition (CVPR), International Conference on Computer Vision (ICCV), European Conference on Computer Vision (ECCV), British Machine Vision Conference (BMVC), and SIGGRAPH Asia. The software packages and datasets associated with this project have been made publicly available at ------- [1] http://mloss.org/software/view/644/ [2] https://people.mpi-inf.mpg.de/~kkim/hreg/index.html [3] https://people.mpi-inf.mpg.de/~kkim/relreg/index.html [4] https://people.mpi-inf.mpg.de/~kkim/diff/index.html [5] https://github.com/saquil/RankCGAN ------- After successfully delivering the scientific outcomes, we are continuing to work on this project with the aim of producing bigger economic and societal impacts. Our techniques can be used in many applications that use/require imagery data exploration and editing. Potential beneficiaries include creative industries, e.g., in graphics, game, and animation, and (end-users of) online shopping and media recommendation. Our most recent impact activity involves using the database ranking algorithm (developed via the current project) for designing data exploration and synthesis tools in the Human-Computer Interaction graduate course of UNIST, Korea (PI's current affiliation).
First Year Of Impact	2018
Sector	Education
Impact Types	Cultural; Economic


Title	Context-guided diffusion algorithm on graphs
Description	Existing approaches for diffusion on graphs, e.g., for label propagation, are mainly focused on isotropic diffusion, which is induced by the commonly-used graph Laplacian regularizer. This algorithm facilitates anisotropic diffusion on graphs and the corresponding label propagation. The algorithm is obtained by constructing positive definite diffusivity operators on the vector bundles of Riemannian manifolds, and discretising them to diffusivity operators on graphs. The algorithm can be used in semi-supervised learning, spectral clustering, and spectral embedding of graph structured data.
Type Of Material	Computer model/algorithm
Year Produced	2016
Provided To Others?	Yes
Impact	Our new data analysis framework has been presented at an academic conference: International Conference on Computer Vision 2015.
URL	https://people.mpi-inf.mpg.de/~kkim/diff/index.html
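
The isotropic diffusion that the entry above contrasts with can be sketched in a few lines. This is only the standard graph Laplacian diffusion, written as an explicit Euler iteration on an assumed toy path graph; the anisotropic, context-guided variant is not reproduced here, and the function name and step-size rule are illustrative choices.

```python
import numpy as np

def isotropic_diffusion(W, f0, steps=50):
    """Iterate f <- f - step * L f with the unnormalised graph Laplacian L."""
    degrees = W.sum(axis=1)
    L = np.diag(degrees) - W
    # The largest eigenvalue of L is at most 2 * max degree, so this step
    # size keeps the explicit update stable.
    step = 0.5 / degrees.max()
    f = np.asarray(f0, dtype=float).copy()
    for _ in range(steps):
        f = f - step * (L @ f)
    return f

# Toy graph (assumed): a path 0-1-2-3-4 with unit edge weights.
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0

f0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # all mass starts on node 0
f = isotropic_diffusion(W, f0, steps=200)
```

Because every edge is treated alike, the initial value spreads evenly and the total is preserved; an anisotropic scheme would instead let diffusion strength depend on direction.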


Title	Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking
Description	Large databases are often organized by hand-labeled metadata, or criteria, which are expensive to collect. We can use unsupervised learning to model database variation, but these models are often high dimensional, complex to parameterize, or require expert knowledge. We learn low-dimensional continuous criteria via interactive ranking, so that the novice user need only describe the relative ordering of examples. This is formed as semi-supervised label propagation in which we maximize the information gained from a limited number of examples. Further, we actively suggest data points to the user to rank in a more informative way than existing work. Our efficient approach allows users to interactively organize thousands of data points along 1D and 2D continuous sliders. We experiment with databases of imagery and geometry to demonstrate that our tool is useful for quickly assessing and organizing the content of large databases.
Type Of Material	Computer model/algorithm
Year Produced	2017
Provided To Others?	Yes
Impact	Our new graph data analysis framework has been presented at an academic conference: British Machine Vision Conference (BMVC) 2017
URL	https://jamestompkin.com/assets/projects/criteriasliders/


Title	Joint diffusion of graph functions and Laplacians
Description	This algorithm is an instantiation of a discrete regularizer on a graph's diffusivity operator. The algorithm is grounded in the theory that regularizing the diffusivity operator corresponds to regularizing the metric on Riemannian manifolds, which further corresponds to regularizing the anisotropic Laplace-Beltrami operator. This new algorithm significantly improves existing semi-supervised learning and ranking algorithms on graphs.
Type Of Material	Computer model/algorithm
Year Produced	2016
Provided To Others?	Yes
Impact	Our new graph data analysis framework has been presented at an academic conference: European Conference on Computer Vision ECCV 2016.
URL	https://people.mpi-inf.mpg.de/~kkim/ldiff/index.html


Title	Ranking CGANs: Subjective Control over Semantic Image Attributes
Description	Given pairwise comparisons of images, our model, called RankCGAN, learns 1) to rank images using a subjective measure; and 2) a generative model that can be controlled by that measure. RankCGAN enables users to synthesize images based on subjective measures of interest.
Type Of Material	Computer model/algorithm
Year Produced	2018
Provided To Others?	Yes
Impact	Our new graph data analysis framework has been presented at an academic conference: British Machine Vision Conference BMVC 2018.
URL	https://github.com/saquil/RankCGAN


Description	User-centric image organization: collaboration with Brown University and Max Planck Institute for Informatics (Note: grant transferred from EP/M00533X/1)
Organisation	Brown University
Country	United States
Sector	Academic/University
PI Contribution	In joint research on user-centric image organization, I contributed by developing new machine learning algorithms tailored for data organization: 1) Our new high-order regularization algorithms do not suffer from the degeneracy of conventional graph Laplacian-type regularization algorithms (see [1,6] Outputs section) and they facilitate the propagation of user-provided data annotations to the entire image database; 2) Our new predictor combination approach enables us to combine multiple heterogeneous predictors made by users benefiting from the sharing of knowledge gained from individual tasks across different data organization problems and users (see [5] Outputs section).
Collaborator Contribution	Prof. Tompkin at Brown University is an expert in computational photography and videography, and user-centric content generation. He contributed by designing interactive systems that enable users to communicate their own data exploration criteria (supported by our new machine learning algorithms). Prof. Christian Theobalt at Max Planck Institute for Informatics, who is an expert in human motion analysis and 3D image analysis and synthesis, contributed his significant technical and scientific knowledge to developing a framework that enables users to design their own gesture-based, animated character control interfaces [4].
Impact	This collaboration has led to several important published (plus unpublished so far) outcomes: [1] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [2] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [3] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. International Conference on Computer Vision, 2015. [4] H. Rhodin, J. Tompkin, K. I. Kim, E. d. Aguiar, H. Pfister, H.-P. Seidel, and C. Theobalt, Generalizing Wave Gestures from Sparse Examples for Real-time Character Control, ACM Trans. Graphics (Proc. SIGGRAPH), 2015. [5] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. International Conference on Computer Vision, 2017. [6] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. British Machine Vision Conference, 2017.
Start Year	2015


Description	User-centric image organization: collaboration with Brown University and Max Planck Institute for Informatics (Note: grant transferred from EP/M00533X/1)
Organisation	Max Planck Society
Department	Max Planck Institute for Informatics
Country	Germany
Sector	Charity/Non Profit
PI Contribution	In joint research on user-centric image organization, I contributed by developing new machine learning algorithms tailored for data organization: 1) Our new high-order regularization algorithms do not suffer from the degeneracy of conventional graph Laplacian-type regularization algorithms (see [1,6] Outputs section) and they facilitate the propagation of user-provided data annotations to the entire image database; 2) Our new predictor combination approach enables us to combine multiple heterogeneous predictors made by users benefiting from the sharing of knowledge gained from individual tasks across different data organization problems and users (see [5] Outputs section).
Collaborator Contribution	Prof. Tompkin at Brown University is an expert in computational photography and videography, and user-centric content generation. He contributed by designing interactive systems that enable users to communicate their own data exploration criteria (supported by our new machine learning algorithms). Prof. Christian Theobalt at Max Planck Institute for Informatics, who is an expert in human motion analysis and 3D image analysis and synthesis, contributed his significant technical and scientific knowledge to developing a framework that enables users to design their own gesture-based, animated character control interfaces [4].
Impact	This collaboration has led to several important published (plus unpublished so far) outcomes: [1] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [2] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [3] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. International Conference on Computer Vision, 2015. [4] H. Rhodin, J. Tompkin, K. I. Kim, E. d. Aguiar, H. Pfister, H.-P. Seidel, and C. Theobalt, Generalizing Wave Gestures from Sparse Examples for Real-time Character Control, ACM Trans. Graphics (Proc. SIGGRAPH), 2015. [5] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. International Conference on Computer Vision, 2017. [6] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. British Machine Vision Conference, 2017.
Start Year	2015


Title	Local high-order regularization on data manifolds
Description	Our software implements a new regularizer which is globally high order and is also sparse for efficient computation in semi-supervised learning applications.
Type Of Technology	Software
Year Produced	2016
Open Source License?	Yes
Impact	The software has been released under GPL on the machine learning open source software forum (mloss.org).
URL	http://mloss.org/software/search/?searchterm=Local+high+order+regularization&post=


Description	Presentation at Dongseo University
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Undergraduate students
Results and Impact	I gave a talk on our semantics-level image exploration work at Dongseo University, Korea as a part of a recently started research collaboration with the Dept. of Digital Contents.
Year(s) Of Engagement Activity	2016


Description	Presentation at Imperial College London
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Postgraduate students
Results and Impact	I was invited to Imperial College London to present our work on a user-centric imagery framework. This led to a research collaboration with Dr. Tae-Kyun Kim at the Imperial Computer Vision & Learning Lab.
Year(s) Of Engagement Activity	2016


Description	Presentation at Kyungpook National University
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Postgraduate students
Results and Impact	I gave a talk on our semantics-level image exploration work at Kyungpook National University, Korea.
Year(s) Of Engagement Activity	2016


Description	Research presentation at Brno University of Technology
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Undergraduate students
Results and Impact	I gave a research presentation on our personalized image manipulation and semantics-level image exploration work at Brno University of Technology, Czech Republic.
Year(s) Of Engagement Activity	2017
URL	http://vgs-it.fit.vutbr.cz/2017/05/03/kwang-in-kim-toward-intuitive-imagery-user-friendly-manipulati...


Description	Research presentation at KAIST, Korea
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Postgraduate students
Results and Impact	I gave a research presentation on our personalized image exploration project at KAIST, Korea.
Year(s) Of Engagement Activity	2019


Description	Research presentation at UNIST, Korea
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Undergraduate students
Results and Impact	I gave a research presentation on our personalized image exploration project at UNIST, Korea.
Year(s) Of Engagement Activity	2019