Structured Sparsity Methods in Machine Learning an Convex Optimisation

Lead Research Organisation: University of Oxford
Department Name: Mathematical Institute

Abstract

Over the past ten years theoretical developments in machine learning (ML) have had a significant impact in statistics, applied mathematics and other fields of scientific research. In particular, fruitful interactions between ML and numerical optimisation have emerged that are expected to lead to theoretical and algorithmic breakthroughs with the potential to render ML methodologies significantly more applicable to many problems of practical importance. The proposed project aims to make significant UK contributions at a crucial juncture in this emerging interdisciplinary field that has so far been dominated by the US and France. Many ML techniques can be cast as problems of minimising an objective function over a large set of parameters. Examples include support vector machines as well as more recent techniques for semi-supervised learning and multi-task learning. Often the objective function is convex. Consequently, ideas from convex optimisation are becoming increasingly important in the design, implementation and analysis of learning algorithms. Up to now, however, ML has almost exclusively resorted to off the shelf methods for convex optimisation, without substantially exploiting the rich theory which lies behind this field. A thesis of this proposal is that there is a need for a deeper interplay between ML and numerical optimisation. Ultimately, bridging the two communities will facilitate communication and the power of core optimisation will be more easily brought to bear in ML and lead to new frontiers in optimisation. An area in which the interplay between ML and optimisation has a particularly important role to play is in the use of sparsity inducing optimisation problems. A rationale that drives the use of sparsity-inducing models is the observation that when the number of model parameters is much larger than the number of observations, a sparse choice of parameters is strongly desirable for fast and accurate learning. Building on this success, we believe that the time is now right for the development of a new line of algorithms for matrix learning problems under structured sparsity constraints. This means that many of the components of the parameter matrix or a decomposition thereof are zero in locations that are related via some rule (e.g the matrix may be constrained to have many zero rows, many zero eigenvalues, to have sparse eigenvectors, etc.).Perhaps the most well-know examples in which structured sparsity has proven beneficial are in collaborative filtering, where the objective function is chosen to favour low rank matrices, and in multi-task learning where the objective function is chosen to favour few common relevant variables across different regression equations. These types of optimisation problems have only recently started to be addressed in ML and optimisation, and several fundamental problems remain open, most importantly the study of efficient algorithms which exploit the underlying sparsity assumptions and a statistical learning analysis of the methods.Our proposal is multidisciplinary and involves substantial exchange of ideas between Computer Science (Machine Learning) and Mathematics (Numerical Optimisation), with three main goals. Firstly, we aim to develop novel and efficient algorithms for learning large structured matrices; fast convergence of the algorithms should be guaranteed when applied to problem data that have a sparse solution. Secondly, in the cases where the assumed sparsity structure leads to NP-hard problems and the first goal is unachievable (this is often the case under low-rank assumptions), we aim to identify tractable convex relaxations and understand their impact on sparsity. Thirdly, we aim for models and algorithms that have a more natural interpretation than generic solvers (e.g., a minimax statistical justification), which should make it more likely that practitioners will embrace the new methodology.
 
Description Many optimisation algorithms in machine learning contain approximate singular value decompositions as a bottleneck computation. One of our focusses is therefore the development of highly scalable, loosely coupled, communication poor parallel algorithms for computing an approximate leading part singular value decomposition of large scale matrices. We identified a class of such algorithms and analysed their convergence analysis.

Another focus is on applying machine learning techniques to the design of statistical significance tests of optimal sequence alignments by taking the empirical distribution of aligned letter pairs into account. So far we made important inroads into this question by proving that the asymptotic empirical distribution is unique for almost all scoring functions, and by developing an efficient Monte Carlo technique for determining the order of fluctuation. This analysis exhibits a deep connection between convex analysis and large deviations theory.

In a third focus point, we investigated optimisation problems over measure spaces in which the optimal measure has a sparse representation as a sum of simple functions. We identified a similar problem structure in a medical imaging problem (3D X-ray imaging with a microemitter array) and in a risk management context (aggregation of risks under marginal constraints). We developed specialised models and algorithms to be able to solve such problems in high dimensions. This research continues.

In a fourth focus, we are investigating a game theoretic approach to explaining the energy mix and prices in deregulated electricity markets. To do this, a thorough understanding of the market mechanisms and physical properties of power plants and the grid are needed to model the structure of the optimisation problems solved by the different market participants. We successfully set up a model that is able to explain price spikes and the prevailing power mix under consideration of startup costs, trading costs and risk. Further refinements of the model will allow us to study the impact on price stability of adding renewables to the energy mix and could be used by policy makers.
Exploitation Route Data mining in big data applications (large-scale principle component analysis on dense matrices). We currently carry out further research to clarify fundamental questions of sample sizes required to accurately estimate singular vectors in PCA (with Henry Matzinger and Ionel Popescu at Georgia Tech, and with Juri Lember and Raul Kangro at Tartu University), which can be exploited in conjunction with our convergence analysis for our distributed PCA algorithm.

Step change in medical imaging (highly portable and affordable X-ray systems based on microemitter arrays). This is the subject of an ongoing research collaboration with Adaptix Ltd that has already led to the filing of 3 patents and 2 publications. A product using the technology will be launched in late 2016.

Improved statistical tests for the homology of finite sequences (applied in genetics, forensics, natural language processing).

Improved risk management in the banking and insurance sector (through robust bounds on the capital at risk through their balance sheet).
Sectors Aerospace, Defence and Marine,Creative Economy,Digital/Communication/Information Technologies (including Software),Energy,Environment,Financial Services, and Management Consultancy,Healthcare,Transport

URL http://www.maths.ox.ac.uk/people/raphael.hauser
 
Description Adaptix Ltd. is taking off as a successful UK startup company, building microemitter arrays that will use an imaging methodology that we pioneered under the project. In further collaboration with the company, we secured two follow on grants for collaborative research, leading to 2 publications and the filing of 3 US patents in medical imaging. A prototype device has been built and is under further development, and the company are working toward a market launch of a dental imaging system in late 2017. If market entry is successful, the novel device will bring low cost 3D X-ray tomography to health services globally. The Operational Risk Management Team at Banco Santander are looking into the use of our robust risk aggregation model and algorithm under their library of methodologies used to compute the capital charge on their balance sheet.
First Year Of Impact 2013
Sector Financial Services, and Management Consultancy,Healthcare
Impact Types Societal,Economic

 
Description Alan Turing Institute Fellowship
Amount £15,000 (GBP)
Organisation Alan Turing Institute 
Sector Academic/University
Country United Kingdom
Start 09/2016 
End 08/2018
 
Description Impact Acceleration Grant
Amount £46,525 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 04/2016 
End 03/2017
 
Description Radius Health - EPSRC Platform Grant 5
Amount £72,000 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2015 
End 03/2016
 
Title Alignment Microstructure 
Description A method for extracting a vector-valued characterisation of the micostructure of the optimal alignment of two finite sequences relative to a fixed scoring function, in a format that can be used by a kernel learning method to reduce the proportion of false positives in statistical tests on sequence homology based on the above said scoring function. 
Type Of Material Computer model/algorithm 
Year Produced 2013 
Provided To Others? Yes  
Impact Interest and feedback from other researchers in the field. 
 
Title Delayed Column Generation in Risk Management 
Description A model for risk aggregation in high-dimensions and an algorithm for solving this model to obtain robust bounds on the capital at risk for an organisation that has this portfolio of risks on their books. 
Type Of Material Computer model/algorithm 
Year Produced 2014 
Provided To Others? Yes  
Impact I've been contacted by the Head of Operational Risk at Banco Santander. A meeting was set up for December 2014, allowing for knowledge transfer and the discussion of a future partnership arrangment, likely to involve the sharing of data. 
 
Title Distributed SVD algorithm 
Description A method for computing a principle component analysis on very large data sets on a parallel computer or a network of computers with heterogenous computational resources. 
Type Of Material Computer model/algorithm 
Year Produced 2012 
Provided To Others? Yes  
Impact So far the dissemination took place through conference and seminar talks. We will publish the source code in early 2015. 
 
Description Microstructure of optimal sequence alignments 
Organisation Georgia Institute of Technology
Country United States 
Sector Academic/University 
PI Contribution Generating ideas, proving theorems, writing journal publications. The goal of this research is to exploit the microstructure of optimal sequence alignments in conjunction with machine learning to avoid (or reduce the number of) false positives in homology tests.
Collaborator Contribution Generating ideas, proving theorems, writing journal publications.
Impact Journal publications, technical reports.
Start Year 2011
 
Description Structured Sparsity in Optimisation and Machine Learning 
Organisation University College London
Country United Kingdom 
Sector Academic/University 
PI Contribution Exchange of ideas in research discussions, writing of papers and dissemination to the wider public.
Collaborator Contribution Exchange of ideas in research discussions, writing of papers and dissemination to the wider public.
Impact Dissemination of project outcomes to the general public in a form that is understandable independent of mathematical knowledge (Internation Innovation feature article).
Start Year 2010
 
Description X-Ray Imaging with Microemitter Arrays 
Organisation Adaptix Ltd
Country United Kingdom 
Sector Private 
PI Contribution Pilot study with an MSc students into the use of sparse optimisation to construct a 3D medical image of measurements taken under X-ray exposure from multiple sources on a microemitter array (MAX). Following this pilot project and a name change of the industrial partner from Radius Health Ltd to Adaptix Ltd, I drafted two patents which were filed by Adaptix, and they in turn financed a one-year postdoctoral research project. This project has lead to 2 publications and a further patent, as well as a further year of follow-on funding.
Collaborator Contribution They own the expertise, technology and associated IP for fabricating the MAX.
Impact 3 patent filings, further funding for a postdoc, 2 publications
Start Year 2012
 
Description X-Ray Imaging with Microemitter Arrays 
Organisation Radius Health
Country United States 
Sector Private 
PI Contribution Pilot study with an MSc students into the use of sparse optimisation to construct a 3D medical image of measurements taken under X-ray exposure from multiple sources on a microemitter array (MAX). Following this pilot project and a name change of the industrial partner from Radius Health Ltd to Adaptix Ltd, I drafted two patents which were filed by Adaptix, and they in turn financed a one-year postdoctoral research project. This project has lead to 2 publications and a further patent, as well as a further year of follow-on funding.
Collaborator Contribution They own the expertise, technology and associated IP for fabricating the MAX.
Impact 3 patent filings, further funding for a postdoc, 2 publications
Start Year 2012
 
Title A medical imaging system having a single fixed array of X-ray detectors and a single fixed array of X-ray emitters for producing a digital 3-dimensional image 
Description A 3D image reconstruction method (physical apparatus based on X-ray source panel, mathematical model, algorithm, artefact removal techniques and data structure) for X-ray imaging systems with multiple sources arranged on a flat panel, illuminating an object that is to be imaged under different angles. 
IP Reference WO2017130013 
Protection Patent application published
Year Protection Granted 2015
Licensed Commercial In Confidence
Impact The methodology is used by Adaptix Ltd., an Oxfordshire based medical imaging startup, in their dental prototype imaging system and will go to market in late 2017.
 
Title A method of designing an x-ray emitter panel 
Description An optimal geometry for an X-ray emission panel (multiple X-ray sources arranged on a flat panel) under consideration of 3D image reconstruction based on sparse optimisation (nonlinear compressed sensing model for X-ray measurements with overlap). 
IP Reference WO2016059535 
Protection Patent granted
Year Protection Granted 2016
Licensed Commercial In Confidence
Impact Adaptix Ltd., an Oxfordshire based startup who develop X-ray source panels, have used the method in their prototype dental imaging system that will go to market in late 2017.
 
Title 3D dental imaging system 
Description My work on 3D imaging for X-ray source panel imaging systems has been incorporated into the prototype dental imaging system that Adaptix Ltd. are planning to bring to market in late 2017. This system provides 3D imaging without moving parts and fills an unmet need for 3D scans of lower cost and resolution than full CT scanning. If the product launch is successful, the technology will be rolled out to other medical imaging applications, notably trauma, TB diagnosis, and radiotherapy. 
Type Diagnostic Tool - Imaging
Current Stage Of Development Refinement. Clinical
Year Development Stage Completed 2017
Development Status Under active development/distribution
Impact The radiology research group of Prof. Frank Van der Heuvel at Oxford University are interested in using the technology in a radiotherapy device that they develop, which takes intermittent 3D images of a tumour to optimise the delivery of therapeutic radiation. We are in discussions to submit a joint grant proposal. 
 
Description Feature article for general audience 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Feature article in Internation Innovation, a magazine aimed at policy makers, industrial leaders and the wider public, to showcase European research.

Email contact by journalists.
Year(s) Of Engagement Activity 2014