Personalized Exploration of Imagery Database
Lead Research Organisation:
University of Bath
Department Name: Computer Science
Abstract
"I want to see jackets which are stylish, but not too fancy. Say, 70% stylish."
This project aims to develop new techniques which can significantly improve data browsing experience in online shopping, dating, media recommendations, and many other applications.
Two very common ways to explore large collections of imagery items, for instance, in online shopping, are to browse a hierarchy of items and to search with textual keywords. The returned results are browsed in lists, typically ordered by popularity. However, popularity is defined across all users as one homogeneous population, and users cannot sort by their own subjective criteria, e.g., by their own personal `style' for clothes: what is `stylish' to one person will be passé to another. Furthermore, there is no way to place items on a continuous scale on which the degree to which each item meets a criterion is known, e.g., how stylish a particular piece of clothing is to a user.
Our goal is to develop new techniques which enable users to organize and explore imagery data based on their own subjective criteria at a high semantic level. This is a challenging problem: Many criteria are hard to quantify and a user may not even be able to articulate the criteria.
We face this challenge by observing that even though users may not be able to specify their criteria quantitatively, or even fully describe them, they are still able to communicate their own notions by providing examples, e.g., "this shoe is cooler than that one". Our goal is to build an algorithm that arranges a large corpus of visual data according to these examples. Once built, the arranged data can be browsed with an interface that exploits the learned criteria to navigate the continuous scale.
The key contributions of the proposed research will include 1) an exploration of different modes of user interaction, with the resulting knowledge feeding into 2) a new algorithm that, by overcoming the limitations of existing approaches, learns effectively and efficiently from user-provided examples and thereby makes personalized data exploration practical.
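The idea of arranging items on a continuous scale from examples such as "this shoe is cooler than that one" can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes each item already has a numeric feature vector (the random `items` array is a hypothetical stand-in), fits a linear score with a standard logistic loss on pairwise comparisons, and rescales the scores to [0, 1]. All function and variable names are illustrative.

```python
import numpy as np

def fit_pairwise_ranker(features, pairs, lr=0.1, steps=500, l2=1e-3):
    """Fit a linear scoring function from pairwise preferences.

    features: (n, d) array, one row per item (hypothetical descriptors).
    pairs: list of (i, j), meaning "item i ranks above item j".
    """
    X = np.asarray(features, dtype=float)
    diffs = np.array([X[i] - X[j] for i, j in pairs])  # (m, d)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = diffs @ w
        # Gradient of mean log(1 + exp(-margin)) plus an L2 penalty.
        weights = 1.0 / (1.0 + np.exp(margins))
        grad = -(diffs * weights[:, None]).mean(axis=0) + l2 * w
        w -= lr * grad
    return w

def to_unit_scale(scores):
    """Map raw scores to [0, 1] so they can drive a slider."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

# Hypothetical data: 50 items with 4 features and a hidden "taste" direction.
rng = np.random.default_rng(0)
items = rng.normal(size=(50, 4))
hidden = items @ np.array([1.0, -0.5, 0.0, 2.0])

# The user compares 40 random pairs of items.
pairs = []
for _ in range(40):
    i, j = rng.choice(50, size=2, replace=False)
    pairs.append((i, j) if hidden[i] > hidden[j] else (j, i))

w = fit_pairwise_ranker(items, pairs)
scale = to_unit_scale(items @ w)  # position of every item on the learned scale
```

A request such as "70% stylish" would then correspond to showing the items whose value in `scale` is close to 0.7.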
Planned Impact
This project aims to develop new techniques which can significantly improve the data browsing experience by enabling users to organize data collections based on their own subjective, semantic-level criteria. If successful, these techniques can be directly used in many applications that use or require data exploration. In particular, online shopping will be the biggest beneficiary of this research. The UK is one of the largest and ever-growing markets in online shopping: as of November 2013, online shopping had grown 10% over 2012 and revenues had reached a monthly record of £10.1 billion. Specific application scenarios include:
1) Finding the perfect chair for a user's room from thousands of possibilities across different styles, by ranking a small subset of chairs by preference.
2) Finding a tasty wine (in terms of personal preference) by trying a small number of different wines: Even novice users could easily establish their shopping portfolio, without having to gain knowledge of domain-specific keywords such as `Tannin' and `Tartaric Acid'.
This research will therefore contribute to the qualitative and quantitative growth of the online shopping market in the UK by attracting users with a significantly improved experience.
Online shopping is only one example of many data browsing applications. Additional application examples are:
- Online dating (£170 million market in the UK): Attractiveness is personal. Ranking a small subset of people would help to tailor the matches a user receives to that user's personal scale of attractiveness.
- Media recommendation (e.g., Netflix, iTunes, Kindle): Ranking a few films, albums, or books in an online library to quickly organize the entire collection by your preference.
Our strategy for realizing such a browsing system is to make advances in machine learning, computer vision, and HCI. In particular, one of the key technical contributions of this project will be an improved algorithm for semi-supervised learning. Since semi-supervised learning is nowadays used extensively in diverse areas including data mining, social network analysis, robotics, and genetics, in the long term this project will have an impact on a much broader range of economic and academic activities which may benefit from these techniques.
Furthermore, our techniques will have a societal impact by helping people save time: if successful, users would no longer have to spend hours hunting for just the right item. Users could sort by their particular criteria and have a good chance of finding the right item within a short time. Collectively, this saves people a lot of time, and makes the shopping experience, or more generally the data browsing experience, much more pleasant.
People |
ORCID iD |
Kwang In Kim (Principal Investigator) |
Publications
Saquil Yassir
(2018)
Ranking CGANs: Subjective Control over Semantic Image Attributes
in arXiv e-prints
Saquil Y.
(2019)
Ranking cGANs: Subjective control over semantic image attributes
in British Machine Vision Conference 2018, BMVC 2018
Dev K.
(2016)
Improving style similarity metrics of 3D shapes
Kim K
(2017)
Predictor Combination at Test Time
Mejjati Y
(2018)
Multi-task Learning by Maximizing Statistical Dependence
Description | The goal of this project is to develop new techniques which enable users to organize and explore imagery data based on their own subjective criteria at a high semantic level. We have developed a set of machine learning algorithms that allow users to communicate their own criteria without having to know how those criteria might be formed or described at the data level [1,3,5,6,7]. One challenge in approaching this problem is the lack of ground-truth labels on which machine learning algorithms can be trained, since users would not want to spend hours providing labels. Our approach is to supplement a small set of labeled data points with a large set of unlabeled data points (i.e., data instances that are not explicitly labeled by users) during learning. It has recently been found that the well-established graph Laplacian-based semi-supervised algorithms can overfit to data, i.e., we obtain a machine which explains the training data perfectly but cannot generalize to new data. We developed new semi-supervised learning algorithms that circumvent this problem and enable stable generalization of given labels to entire unlabeled datasets [3,5,6,7]. Furthermore, we applied these algorithms and the principle of user-guided machine learning to developing 1) an interactive system that enables users to personalize the character controller for synthesizing animations in video games and animation [8], and 2) a new 3D shape data exploration framework that enables users to define their own distance measures, facilitating semantics-level data retrieval [4].
Another key finding of this project is a new algorithmic framework that facilitates the sharing of knowledge gained from individual tasks across different data organization problems and users [2]. Many real-world data organization problems involve learning several tasks exhibiting mutual dependence, e.g., the task of retrieving `smart' cars in a database is correlated with searching for `sporty' cars. However, existing attempts to `combine' such multiple tasks are limited in that they require known mathematical forms of the individual task executors, and are therefore not directly applicable to aggregating decisions from pre-compiled software libraries or human evaluations (with unknown mathematical forms). We developed a new form-independent framework that automatically discovers latent dependence across multiple heterogeneous tasks and benefits from combining them without requiring access to their mathematical forms [2]. This significantly broadens the application spectrum of multiple task combination approaches.
Publications
[1] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. BMVC, 2017.
[2] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. ICCV, 2017.
[3] K. I. Kim, Semi-supervised Learning based on Joint Diffusion of Graph Functions and Laplacians, Proc. ECCV, 2016.
[4] K. Dev, K. I. Kim, N. Villar, and M. Lau, Improving style similarity metrics of 3D shapes, Proc. Graphics Interface, 2016.
[5] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. CVPR, 2015.
[6] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. CVPR, 2015.
[7] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. ICCV, 2015.
[8] H. Rhodin, J. Tompkin, K. I. Kim, E. de Aguiar, H.-P. Seidel, and C. Theobalt, Generalizing wave gestures from sparse examples for real-time character control, ACM Trans. Graphics (Proc. SIGGRAPH Asia), 2015. |
Exploitation Route | Our new semi-supervised learning algorithms can be directly applied to various regression, classification, and ranking problems. The research outcomes have been disseminated through high-quality scientific conferences, journals, and open source software packages: ------- https://people.mpi-inf.mpg.de/~kkim/hreg/index.html https://people.mpi-inf.mpg.de/~kkim/relreg/index.html https://people.mpi-inf.mpg.de/~kkim/diff/index.html ------- Our interactive character control system (as an application of our algorithmic framework) can be used by computer animators, e.g., for video games and animation. |
Sectors | Creative Economy,Digital/Communication/Information Technologies (including Software),Healthcare |
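For readers unfamiliar with the graph Laplacian-based semi-supervised learning that the key findings above take as their starting point, the following is a minimal sketch of that conventional baseline only. It is not one of the project's new algorithms. The Gaussian similarity graph, the parameters `sigma` and `reg`, and the toy two-cluster data are all assumptions made for illustration.

```python
import numpy as np

def laplacian_label_propagation(X, labeled_idx, labels, sigma=1.0, reg=1e-2):
    """Propagate a few labels to all points with a graph Laplacian regularizer.

    Minimises  sum over labeled i of (f_i - y_i)^2  +  reg * f^T L f,
    where L = D - W is the Laplacian of a Gaussian similarity graph.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W

    J = np.zeros((n, n))
    J[labeled_idx, labeled_idx] = 1.0  # selects the labeled points
    y = np.zeros(n)
    y[labeled_idx] = labels

    # Setting the gradient of the objective to zero gives (J + reg * L) f = y.
    return np.linalg.solve(J + reg * L, y)

# Hypothetical data: two well-separated clusters, one labeled point in each.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
cluster_b = rng.normal(loc=3.0, scale=0.3, size=(20, 2))
points = np.vstack([cluster_a, cluster_b])

f = laplacian_label_propagation(points, labeled_idx=[0, 20], labels=[1.0, -1.0])
```

In this toy setting the sign of `f` separates the two clusters even though only two of the forty points carry a label, which is the sense in which unlabeled data supplements the labeled data.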
Description | The outcomes of this project have been reported to the scientific community at various international conferences including the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), the International Conference on Computer Vision (ICCV), the European Conference on Computer Vision (ECCV), the British Machine Vision Conference (BMVC), and SIGGRAPH Asia. The software packages and datasets associated with this project have been made publicly available at ------- [1] http://mloss.org/software/view/644/ [2] https://people.mpi-inf.mpg.de/~kkim/hreg/index.html [3] https://people.mpi-inf.mpg.de/~kkim/relreg/index.html [4] https://people.mpi-inf.mpg.de/~kkim/diff/index.html [5] https://github.com/saquil/RankCGAN ------- After successfully delivering the scientific outcomes, we are continuing to work on this project with the aim of producing greater economic and societal impact. Our techniques can be used in many applications that use or require imagery data exploration and editing. Potential beneficiaries include creative industries, e.g., graphics, games, and animation, and (end users of) online shopping and media recommendation. Our most recent impact activity involves using the database ranking algorithm (developed in the current project) to design data exploration and synthesis tools in the Human-Computer Interaction graduate course at UNIST, Korea (the PI's current affiliation). |
First Year Of Impact | 2018 |
Sector | Education |
Impact Types | Cultural,Economic |
Title | Context-guided diffusion algorithm on graphs |
Description | Existing approaches for diffusion on graphs, e.g., for label propagation, are mainly focused on isotropic diffusion, which is induced by the commonly-used graph Laplacian regularizer. This algorithm facilitates anisotropic diffusion on graphs and the corresponding label propagation. The algorithm is obtained by constructing positive definite diffusivity operators on the vector bundles of Riemannian manifolds, and discretising them to diffusivity operators on graphs. The algorithm can be used in semi-supervised learning, spectral clustering, and spectral embedding of graph structured data. |
Type Of Material | Computer model/algorithm |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | Our new data analysis framework has been presented at an academic conference: International Conference on Computer Vision 2015. |
URL | https://people.mpi-inf.mpg.de/~kkim/diff/index.html |
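To make the contrast with isotropic diffusion in the record above concrete, here is a small sketch of conventional isotropic diffusion on a graph, the baseline that the anisotropic algorithm generalizes. It does not implement the context-guided method itself. The path graph, the step size, and the function name are assumptions chosen for illustration.

```python
import numpy as np

def isotropic_diffusion(W, f0, step=0.5, iters=10):
    """Repeatedly average each node's value with its neighbours' values.

    Each update is f <- f - step * (I - P) f, where P = D^{-1} W is the
    row-normalised weight matrix, so every edge is treated the same way
    regardless of direction or context.
    """
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1)
    P = W / deg[:, None]
    f = np.asarray(f0, dtype=float).copy()
    for _ in range(iters):
        f = (1.0 - step) * f + step * (P @ f)
    return f

# Hypothetical example: a path graph with 5 nodes and a single labeled node.
W = np.zeros((5, 5))
for k in range(4):
    W[k, k + 1] = W[k + 1, k] = 1.0
f0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])

f = isotropic_diffusion(W, f0)
```

The label mass placed on the first node spreads along the path. An anisotropic scheme would instead let the amount of spreading depend on the edge, which this sketch deliberately does not do.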
Title | Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking |
Description | Large databases are often organized by hand-labeled metadata, or criteria, which are expensive to collect. We can use unsupervised learning to model database variation, but these models are often high dimensional, complex to parameterize, or require expert knowledge. We learn low-dimensional continuous criteria via interactive ranking, so that the novice user need only describe the relative ordering of examples. This is formulated as semi-supervised label propagation in which we maximize the information gained from a limited number of examples. Further, we actively suggest data points to the user to rank in a more informative way than existing work. Our efficient approach allows users to interactively organize thousands of data points along 1D and 2D continuous sliders. We experiment with databases of imagery and geometry to demonstrate that our tool is useful for quickly assessing and organizing the content of large databases. |
Type Of Material | Computer model/algorithm |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | Our new graph data analysis framework has been presented at an academic conference: British Machine Vision Conference (BMVC) 2017 |
URL | https://jamestompkin.com/assets/projects/criteriasliders/ |
Title | Joint diffusion of graph functions and Laplacians |
Description | This algorithm is an instantiation of a discrete regularizer on a graph's diffusivity operator. The algorithm is grounded in the theory that regularizing the diffusivity operator corresponds to regularizing the metric on Riemannian manifolds, which further corresponds to regularizing the anisotropic Laplace-Beltrami operator. This new algorithm significantly improves existing semi-supervised learning and ranking algorithms on graphs. |
Type Of Material | Computer model/algorithm |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | Our new graph data analysis framework has been presented at an academic conference: European Conference on Computer Vision ECCV 2016. |
URL | https://people.mpi-inf.mpg.de/~kkim/ldiff/index.html |
Title | Ranking CGANs: Subjective Control over Semantic Image Attributes |
Description | Given pairwise comparisons of images, our model, called RankCGAN, learns 1) to rank images using a subjective measure; and 2) a generative model that can be controlled by that measure. RankCGAN enables users to synthesize images based on subjective measures of interest. |
Type Of Material | Computer model/algorithm |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | Our new model has been presented at an academic conference: British Machine Vision Conference (BMVC) 2018. |
URL | https://github.com/saquil/RankCGAN |
Description | User-centric image organization: collaboration with Brown University and Max Planck Institute for Informatics (Note: grant transferred from EP/M00533X/1) |
Organisation | Brown University |
Country | United States |
Sector | Academic/University |
PI Contribution | In joint research on user-centric image organization, I contributed by developing new machine learning algorithms tailored for data organization: 1) Our new high-order regularization algorithms do not suffer from the degeneracy of conventional graph Laplacian-type regularization algorithms (see [1,6] in the Outputs section) and they facilitate the propagation of user-provided data annotations to the entire image database; 2) Our new predictor combination approach enables us to combine multiple heterogeneous predictors made by users, benefiting from the sharing of knowledge gained from individual tasks across different data organization problems and users (see [5] in the Outputs section). |
Collaborator Contribution | Prof. Tompkin at Brown University is an expert in computational photography and videography, and in user-centric content generation. He contributed by designing interactive systems that enable users to communicate their own data exploration criteria (supported by our new machine learning algorithms). Prof. Christian Theobalt at the Max Planck Institute for Informatics, who is an expert in human motion analysis and in 3D image analysis and synthesis, contributed his significant technical and scientific knowledge to developing a framework that enables users to design their own gesture-based, animated character control interfaces [4]. |
Impact | This collaboration has led to several important published outcomes (plus some as yet unpublished): [1] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [2] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [3] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. International Conference on Computer Vision, 2015. [4] H. Rhodin, J. Tompkin, K. I. Kim, E. de Aguiar, H. Pfister, H.-P. Seidel, and C. Theobalt, Generalizing Wave Gestures from Sparse Examples for Real-time Character Control, ACM Trans. Graphics (Proc. SIGGRAPH Asia), 2015. [5] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. International Conference on Computer Vision, 2017. [6] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. British Machine Vision Conference, 2017. |
Start Year | 2015 |
Description | User-centric image organization: collaboration with Brown University and Max Planck Institute for Informatics (Note: grant transferred from EP/M00533X/1) |
Organisation | Max Planck Society |
Department | Max Planck Institute for Informatics |
Country | Germany |
Sector | Charity/Non Profit |
PI Contribution | In joint research on user-centric image organization, I contributed by developing new machine learning algorithms tailored for data organization: 1) Our new high-order regularization algorithms do not suffer from the degeneracy of conventional graph Laplacian-type regularization algorithms (see [1,6] in the Outputs section) and they facilitate the propagation of user-provided data annotations to the entire image database; 2) Our new predictor combination approach enables us to combine multiple heterogeneous predictors made by users, benefiting from the sharing of knowledge gained from individual tasks across different data organization problems and users (see [5] in the Outputs section). |
Collaborator Contribution | Prof. Tompkin at Brown University is an expert in computational photography and videography, and in user-centric content generation. He contributed by designing interactive systems that enable users to communicate their own data exploration criteria (supported by our new machine learning algorithms). Prof. Christian Theobalt at the Max Planck Institute for Informatics, who is an expert in human motion analysis and in 3D image analysis and synthesis, contributed his significant technical and scientific knowledge to developing a framework that enables users to design their own gesture-based, animated character control interfaces [4]. |
Impact | This collaboration has led to several important published outcomes (plus some as yet unpublished): [1] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Local high-order regularization on data manifolds, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [2] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Semi-supervised learning with explicit relationship regularization, Proc. IEEE Computer Vision and Pattern Recognition, 2015. [3] K. I. Kim, J. Tompkin, H. Pfister, and C. Theobalt, Context-guided diffusion for label propagation on graphs, Proc. International Conference on Computer Vision, 2015. [4] H. Rhodin, J. Tompkin, K. I. Kim, E. de Aguiar, H. Pfister, H.-P. Seidel, and C. Theobalt, Generalizing Wave Gestures from Sparse Examples for Real-time Character Control, ACM Trans. Graphics (Proc. SIGGRAPH Asia), 2015. [5] K. I. Kim, J. Tompkin, and C. Richardt, Predictor Combination at Test Time, Proc. International Conference on Computer Vision, 2017. [6] J. Tompkin, K. I. Kim, H. Pfister, and C. Theobalt, Criteria Sliders: Learning Continuous Database Criteria via Interactive Ranking, Proc. British Machine Vision Conference, 2017. |
Start Year | 2015 |
Title | Local high-order regularization on data manifolds |
Description | Our software implements a new regularizer which is globally high order and is also sparse for efficient computation in semi-supervised learning applications. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | The software has been released under GPL on the machine learning open source software forum (mloss.org). |
URL | http://mloss.org/software/search/?searchterm=Local+high+order+regularization&post= |
Description | Presentation at Dongseo University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Undergraduate students |
Results and Impact | I gave a talk on our semantics-level image exploration work at Dongseo University, Korea as a part of a recently started research collaboration with the Dept. of Digital Contents. |
Year(s) Of Engagement Activity | 2016 |
Description | Presentation at Imperial College London |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | I was invited to Imperial College London to present our work on a user-centric imagery framework. This led to a research collaboration with Dr. Tae-Kyun Kim at the Imperial Computer Vision & Learning Lab. |
Year(s) Of Engagement Activity | 2016 |
Description | Presentation at Kyungpook National University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | I gave a talk on our semantics-level image exploration work at Kyungpook National University, Korea. |
Year(s) Of Engagement Activity | 2016 |
Description | Research presentation at Brno University of Technology |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Undergraduate students |
Results and Impact | I gave a research presentation on our personalized image manipulation and semantics-level image exploration work at Brno University of Technology, Czech Republic. |
Year(s) Of Engagement Activity | 2017 |
URL | http://vgs-it.fit.vutbr.cz/2017/05/03/kwang-in-kim-toward-intuitive-imagery-user-friendly-manipulati... |
Description | Research presentation at KAIST, Korea |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | I gave a research presentation on our personalized image exploration project at KAIST, Korea. |
Year(s) Of Engagement Activity | 2019 |
Description | Research presentation at UNIST, Korea |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Undergraduate students |
Results and Impact | I gave a research presentation on our personalized image exploration project at UNIST, Korea. |
Year(s) Of Engagement Activity | 2019 |