Graphical Models for Relational Data: New Challenges and Solutions
Lead Research Organisation:
University of Cambridge
Department Name: Engineering
Abstract
Data often come in the form of objects and relationships: for instance, a library consists of books that cite each other; proteins bind to other proteins according to a variety of patterns; a network of online customers is formed by people who indicate which other customers give reliable product recommendations. Such relationships can be used to predict the behavior and properties of each object. For instance, if a particular news article cites several sports articles, this is evidence that the article is likely to be about sports. We propose novel ways of exploring this relational information. The first task is precisely how to predict the properties of an object (e.g., the class of a news article) based on other objects that share a relationship with it (e.g., the other articles that are cited by or cite our target). We show that there are important forms of relationship that are not properly treated by current methods, and propose a new methodology to account for such relations. The second task focuses on ways to measure similarity of relational structures. For instance, if we know that two proteins physically interact inside a yeast cell, can we infer which other pairs of proteins are linked in a similar way? We show how to formulate problems like this using probabilistic models, and develop novel ways of discovering patterns in relational data with applications to a variety of real-world problems.
Publications
H Wallach
(2010)
Learning the Structure of Deep Sparse Graphical Models
Huszár F
(2011)
A Kernel Approach to Tractable Bayesian Nonparametrics
Lacoste-Julien S
(2013)
SIGMa
R Silva
(2008)
Hidden Common Cause Relations in Relational Learning
S Lacoste-Julien
(2011)
Approximate inference for the loss-calibrated Bayesian
Silva R
(2010)
Ranking relations using analogies in biological and information networks
in The Annals of Applied Statistics
Silva R.
(2009)
Factorial mixture of Gaussians and the marginal independence model
in Journal of Machine Learning Research
Silva R.
(2009)
Hidden common cause relations in relational learning
in Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
Description | Data often come in the form of objects and relationships: for instance, a library consists of books that cite each other; proteins bind to other proteins according to a variety of patterns; a network of customers is formed by people who indicate which other customers are trusted reviewers. Such relationships can be used to predict the behavior and properties of each object. For instance, if a particular news article cites several sports articles, this is evidence that the article is likely to be about sports. We have developed novel ways of exploring this relational information. The first task is precisely how to predict the properties of an object (e.g., the class of a news article) based on other objects that share a relationship with it (e.g., the other articles that are cited by or cite our target). We showed that there are important forms of relationship that are not properly treated by current methods, and developed a new methodology to account for such relations. The second task is how to measure similarity of relational structures. For instance, if two proteins are physically interacting inside a yeast cell, which other pairs of proteins are linked in a similar way? We showed which probabilistic models correspond to this question, and developed novel ways of discovering patterns in relational data with applications in molecular biology. We also explored aspects of causal learning, and how to combine large databases of relational data. |
Exploitation Route | Relational data are ubiquitous. Our methods can be used in many areas of Data Science to understand and predict the relations between objects. |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | Due to the widespread availability of relational data, our work can be directly used in a variety of domains. For instance, companies that want to automatically generate metadocuments by classifying groups of text files (e.g., the pages generated automatically by Google News) will benefit from a new approach to classifying relational objects: in their case, objects are text documents, and relationships are citations or hyperlinks between documents. Biologists who want to unveil new patterns of protein-protein interactions will benefit from new tools that measure similarity of relational structures. Moreover, our work has also had a direct impact on theoretical machine learning. We developed new families of graphical models and inference algorithms that solve problems which cannot be treated with current machine learning methods. |
First Year Of Impact | 2012 |
Sector | Digital/Communication/Information Technologies (including Software),Healthcare |
Impact Types | Economic |
Description | |
Amount | £55,000 (GBP) |
Funding ID | Google Research Award |
Organisation | |
Sector | Private |
Country | United States |
Start | 05/2009 |
Description | Microsoft |
Amount | £66,000 (GBP) |
Funding ID | PhD Scholarship |
Organisation | Microsoft Research |
Sector | Private |
Country | Global |
Start |
Description | Microsoft |
Amount | £83,600 (GBP) |
Funding ID | Award |
Organisation | Microsoft Research |
Sector | Private |
Country | Global |
Start |