Fast, Locally Adaptive Inference for Machine Learning in Graphical Models

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

Graphical models are a powerful tool in machine learning with successful applications in diverse areas such as medical diagnosis, natural language processing, robotics, speech recognition and analysis of genetic data. Despite this success, modern data sets place new demands on the graphical modelling framework: the models can be enormous, yet exact inference in graphical models is intractable. Despite the extensive literature on approximate inference, there is still a huge gap between the largest data sets that we wish to analyse and the largest graphical models that we can handle.

In order to meet the challenges of these new applications, this project concerns new approximate inference algorithms for the large-scale graphical models that arise in practical applications of machine learning. Very few existing inference algorithms can handle extremely large models with continuous variables, and important classes of inference algorithms, such as Monte Carlo techniques, have not been scaled to such models at all. Computationally efficient inference would significantly expand the range of applications to which the graphical modelling framework can be applied.

Planned Impact

The work in this proposal has the potential for substantial economic benefits, because better inference algorithms will enable the graphical modelling framework to be applied with less effort to a broader variety of problems. Graphical modelling provides a powerful, principled reasoning framework that has proven successful for problems in which there is noisy data that only indirectly indicates the true variables of interest. Clearly, this describes a broad variety of applications, both in academia and industry, and for this reason the range of successful applications of graphical modelling has been impressive---including mobile robotics, computer vision, social network analysis, managing online advertisements, and automatically extracting information from product reviews. Despite this success, we believe that graphical modelling has even broader potential for explosive success throughout industry, but this potential cannot be realised without fundamental research into scalable inference algorithms.

The problem is that graphical models are more difficult to apply than they should be. An external user cannot really be expected to apply general graphical models without either collaborating with an expert in graphical models or becoming an expert themselves. Although the graphical modelling approach has high practical usefulness, unfortunately the expertise needed to apply it is also high. New, scalable inference algorithms have the potential to significantly reduce the burden of applying the graphical modelling approach, enabling new industrial applications.

Publications

Castellà Q (2014) Word storms

Zhang, Y (2012) Continuous Relaxations for Discrete Hamiltonian Monte Carlo in Advances in Neural Information Processing Systems (NIPS) 2012

Zhang, Y (2011) Quasi-Newton Markov chain Monte Carlo in Advances in Neural Information Processing Systems (NIPS) 2011

 
Description We have developed new methods that make Hamiltonian Monte Carlo (HMC) more scalable to the complex, high-dimensional problems that arise in modern data analysis, and more easily applicable to modern applications, including Bayesian hierarchical models and discrete models. These methods include incorporating quasi-Newton approximations from the optimization literature into HMC (NIPS 2011), using the Hubbard-Stratonovich transform to apply HMC within discrete models (NIPS 2012), and introducing a new semi-separability criterion to dramatically improve the performance of so-called geometric methods for HMC (NIPS 2014, to appear).

This has resulted in publications in the most prestigious international conferences in machine learning --- three publications in the Neural Information Processing Systems conference, which typically has a 20% acceptance rate. One of our publications (NIPS 2012) was selected for a spotlight presentation at the conference --- only 5% of submissions to the conference were recognized in this way.

Additionally, we explored new potential applications for approximate inference within latent variable graphical models, in particular, text visualization. We introduced a new visualisation, called word storms, that adapts the popular method of word clouds to corpora of related documents rather than single documents. For large corpora, this application raises computational issues that offer potential for future applications of approximate inference methods.
Exploitation Route MCMC methods are useful for a wide variety of statistical data analysis, ranging from artificial intelligence to political science. The methodology that we have developed is useful well beyond the range of models that were used to initially demonstrate our work.
Sectors Digital/Communication/Information Technologies (including Software),Energy,Healthcare,Retail,Transport

URL http://homepages.inf.ed.ac.uk/csutton/
 
Description There is an increasing amount of interest in Hamiltonian Monte Carlo methods as a way of performing approximate inference in large models across a wide range of application areas. For example, the STAN system from Columbia University, which is becoming increasingly popular within social science, uses HMC during inference, raising exciting possibilities for impact beyond the time span of the project. All of the code from our project is freely available on the internet so that others can build on our methods and apply them in practice.
First Year Of Impact 2014
Sector Other