Accelerated Coordinate Descent Methods for Big Data Problems

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Mathematics

Abstract

Much of modern society and economy, in the United Kingdom and elsewhere, is moving in the direction of digitization and computation. Humankind is now able to collect and store enormous quantities of digital data coming from sources such as health records (e.g., IBM ``Watson'' project, MRI/CT scans), government databases (e.g., e-Government, GORS: government operational research service), social networks (e.g., Facebook, Linked-IN, delicious), online news (e.g., New York Times article database), corporate databases (e.g., bank records, Amazon.com) and the internet. Global society is, as a consequence, facing many unprecedented challenges and opportunities. One of the biggest of these has to do with the ability (or rather, lack thereof) to distill, understand and utilize in an optimal way the information contained within these gigantic data sources. The main technology for this is to "form an optimization problem'' and then solve it using a well-chosen optimization algorithm in a suitable computing environment (e.g., a multicore workstation, GPU-enabled machine, cloud).

In this project we aim to contribute to a breakthrough in our ability to solve optimization problems arising from big data domains via developing, analyzing and implementing new accelerated parallel coordinate descent (CD) methods. Since in big data problems the data is typically highly structured, well-designed CD methods can have very low memory requirements and arithmetic cost per iteration---often much smaller than the dimension of the problem. This is in sharp contrast with standard methods whose arithmetic complexity of a single iteration depends on the dimension at least quadratically.

Our research objectives are:

1. Acceleration Theory. We will analyze the iteration complexity (i.e., give bounds on the number of iterations/steps needed to achieve a prescribed level of accuracy) of new parallel coordinate descent methods accelerated using the following 4 strategies: a) nonuniformity (of the frequency with which individual coordinates are updated), b) asynchronicity (of updates and computation), c) distribution (of data and computation to nodes of a cluster) and d) inexactness (of certain operations and computations the algorithm depends on).

2. Stochastic Gradient Descent. We will analyze theoretically and test numerically the relationship between parallel coordinate descent (CD) methods and parallel stochastic gradient descent (SGD) methods.

3. ACDC Code. We will implement the accelerated algorithms in a code which we will make publicly available.

Planned Impact

The pathways to impact document details the ways in which we plan to achieve non-academic impact. The Case for Support details the academic impact.

Here we offer a brief summary:

Academic Impact:
- close research communities: optimization, operational research, numerical analysis, computer science, machine learning, compressed sensing, high performance computing, ITC
- more distant research communities: biology, astronomy, civil engineering, topology design, signal processing
- individuals: Prof Nesterov (Louvain), Prof Srebro (Chicago), Martin Takac (Edinburgh), PDRA, Prof Gondzio (Edinburgh), Prof Recht (Madison), Prof Wright (Madison), Dr Forgan (Astronomy)

Non-academic Impact:
- ACDC code (publicly available efficient code based on the algorithms developed in Module 1)
- SAS Institute (Dr Polik, Dr Chari, Dr Griffin)
- Western General Hospital (Dr Green)
- Arup (Mr Simpson)

Pathways:
- writing papers in top journals
- presenting at top conferences and workshops
- website with code at code.google.com + links to it from various places
- meetings with and feedback from SAS Institute at conferences (at least 1-2 x per year) and via teleconferencing facilities
- visit to SAS Headquarters in Cary, North Carolina
- impact workshop (to be held in May 2015)
- supervision of 2 MSc dissertations (Summer 2013 and Summer 2014)

Since structured convex optimization, randomized algorithms and data mining are important and fast growing fields, this proposal is timely for establishing and growing an internationally competitive research group in this area based in the UK.

Quote from EPSRC website: "For UK research to have maximum impact, our researchers need to work with the very best on the global stage." We strongly believe that, given the quality of the network of researchers involved and, in particular, of the visiting researchers, this proposal falls into this category.
 
Description Many new algorithms for big data optimization.
Exploitation Route The algorithms I developed underpin modern information based society. They are of fundamental nature, and applicable in virtually all domains of data science.
Sectors Aerospace, Defence and Marine,Agriculture, Food and Drink,Chemicals,Construction,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Energy,Environment,Financial Services, and Management Consultancy,Healthcare,Leisure Activities, including Sports, Recreation and Tourism,Manufacturing, including Industrial Biotechology,Transport

URL http://www.maths.ed.ac.uk/~prichtar/i_papers.html
 
Description Several of my algorithms are now adopted by industry.
First Year Of Impact 2013
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Societal,Economic

 
Description EPSRC Early Career Fellowship
Amount £823,211 (GBP)
Funding ID EP/N005538/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 12/2020
 
Title CoCoA 
Description A framework for communication-efficient distributed optimization for machine learning. 
Type Of Technology Software 
Year Produced 2015 
Impact The CoCoA [NIPS 2014] / CoCoA+ [ICML 2015] distributed optimization algorithm developed in a duo of papers with two co-authors from Edinburgh (Martin Takác, myself) has won the MLconf Industry Impact Student Research Award. The award goes to our coauthor Virginia Smith (UC Berkeley). Other co-authors: M. Jaggi (ETH Zurich), M.I. Jordan (Berkeley), C. Ma (Lehigh), J. Terhorst (UC Berkeley), S. Krishnan (UC Berkeley), T. Hofmann (ETH Zurich). About the award: "This year, we started a new award program called the MLconf Industry Impact Student Research Award, which is sponsored by Google. This fall, our committee of distinguished ML professionals reviewed several nominations sent in from members of the MLconf community. There were several great researchers that were nominated and the committee arrived at awarding 2 students whose work, they believe, has the potential to disrupt the industry in the future. The two winners that were announced at MLconf SF 2015 are UC Irvine Student, Furong Huang and UC Berkeley Student, Virginia Smith. Below are summaries of their research. We've invited both researchers to present their work at upcoming MLconf events." The citation: " Virginia Smith's research focuses on distributed optimization for large-scale machine learning. The main challenge in many large-scale machine learning tasks is to solve an optimization objective involving data that is distributed across multiple machines. In this setting, optimization methods that work well on single machines must be re-designed to leverage parallel computation while reducing communication costs. This requires developing new distributed optimization methods with both competitive practical performance and strong theoretical convergence guarantees. Virginia's work aims to determine policies for distributed computation that meet these requirements, in particular through the development of a novel primal-dual framework, CoCoA, which is written on Spark. The theoretical and practical development of CoCoA is an important step for future data scientists hoping to deploy efficient large-scale machine learning algorithms." 
URL https://github.com/gingsmith/cocoa
 
Description Talks 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact talks at workshops, seminars, conferences and so on; more than 15 a year
Year(s) Of Engagement Activity 2015,2016,2017
 
Description youtube talks 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Research talks posted on youtube. Sparked interest in the developed methods in various audiences.
Year(s) Of Engagement Activity 2013,2014,2015
URL https://www.youtube.com/watch?v=0sHOfqhCZw0