Accelerated Coordinate Descent Methods for Big Data Problems

Lead Research Organisation: University of Edinburgh

Department Name: Sch of Mathematics

Abstract

Much of modern society and economy, in the United Kingdom and elsewhere, is moving in the direction of digitization and computation. Humankind is now able to collect and store enormous quantities of digital data coming from sources such as health records (e.g., IBM ``Watson'' project, MRI/CT scans), government databases (e.g., e-Government, GORS: government operational research service), social networks (e.g., Facebook, Linked-IN, delicious), online news (e.g., New York Times article database), corporate databases (e.g., bank records, Amazon.com) and the internet. Global society is, as a consequence, facing many unprecedented challenges and opportunities. One of the biggest of these has to do with the ability (or rather, lack thereof) to distill, understand and utilize in an optimal way the information contained within these gigantic data sources. The main technology for this is to "form an optimization problem'' and then solve it using a well-chosen optimization algorithm in a suitable computing environment (e.g., a multicore workstation, GPU-enabled machine, cloud).

In this project we aim to contribute to a breakthrough in our ability to solve optimization problems arising from big data domains via developing, analyzing and implementing new accelerated parallel coordinate descent (CD) methods. Since in big data problems the data is typically highly structured, well-designed CD methods can have very low memory requirements and arithmetic cost per iteration---often much smaller than the dimension of the problem. This is in sharp contrast with standard methods whose arithmetic complexity of a single iteration depends on the dimension at least quadratically.

Our research objectives are:

1. Acceleration Theory. We will analyze the iteration complexity (i.e., give bounds on the number of iterations/steps needed to achieve a prescribed level of accuracy) of new parallel coordinate descent methods accelerated using the following 4 strategies: a) nonuniformity (of the frequency with which individual coordinates are updated), b) asynchronicity (of updates and computation), c) distribution (of data and computation to nodes of a cluster) and d) inexactness (of certain operations and computations the algorithm depends on).

2. Stochastic Gradient Descent. We will analyze theoretically and test numerically the relationship between parallel coordinate descent (CD) methods and parallel stochastic gradient descent (SGD) methods.

3. ACDC Code. We will implement the accelerated algorithms in a code which we will make publicly available.

Planned Impact

The pathways to impact document details the ways in which we plan to achieve non-academic impact. The Case for Support details the academic impact.

Here we offer a brief summary:

Academic Impact:
- close research communities: optimization, operational research, numerical analysis, computer science, machine learning, compressed sensing, high performance computing, ITC
- more distant research communities: biology, astronomy, civil engineering, topology design, signal processing
- individuals: Prof Nesterov (Louvain), Prof Srebro (Chicago), Martin Takac (Edinburgh), PDRA, Prof Gondzio (Edinburgh), Prof Recht (Madison), Prof Wright (Madison), Dr Forgan (Astronomy)

Non-academic Impact:
- ACDC code (publicly available efficient code based on the algorithms developed in Module 1)
- SAS Institute (Dr Polik, Dr Chari, Dr Griffin)
- Western General Hospital (Dr Green)
- Arup (Mr Simpson)

Pathways:
- writing papers in top journals
- presenting at top conferences and workshops
- website with code at code.google.com + links to it from various places
- meetings with and feedback from SAS Institute at conferences (at least 1-2 x per year) and via teleconferencing facilities
- visit to SAS Headquarters in Cary, North Carolina
- impact workshop (to be held in May 2015)
- supervision of 2 MSc dissertations (Summer 2013 and Summer 2014)

Since structured convex optimization, randomized algorithms and data mining are important and fast growing fields, this proposal is timely for establishing and growing an internationally competitive research group in this area based in the UK.

Quote from EPSRC website: "For UK research to have maximum impact, our researchers need to work with the very best on the global stage." We strongly believe that, given the quality of the network of researchers involved and, in particular, of the visiting researchers, this proposal falls into this category.

Funded Value:

£100,679

Funded Period:

Nov 13 - Jan 16

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/K02325X/1

Principal Investigator:

Peter Richtarik

Research Subject:

Info. & commun. Technol. (25%)

Mathematical sciences (75%)

Research Topic:

Artificial Intelligence (25%)

Mathematical Aspects of OR (40%)

Numerical Analysis (35%)

Organisations

People	ORCID iD
Peter Richtarik (Principal Investigator)

Publications

Author Name

Title Publication Date Published

|< < 1 2 3 4 5 6 > >|

10 25 50

Allen-Zhu Zeyuan (2015) Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling in arXiv e-prints

Chambolle A (2018) Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications in SIAM Journal on Optimization

Csiba D (2015) Stochastic dual coordinate ascent with adaptive probabilities

Csiba D. (2018) Importance sampling for minibatches in Journal of Machine Learning Research

Csiba Dominik (2015) Primal Method for ERM with Flexible Mini-batching Schemes and Non-convex Losses in arXiv e-prints

Fercoq O (2013) Accelerated, Parallel and Proximal Coordinate Descent

Fercoq O (2015) Accelerated, Parallel, and Proximal Coordinate Descent in SIAM Journal on Optimization

Fercoq O (2016) Restarting accelerated gradient methods with a rough strong convexity estimate

Fercoq O (2019) Adaptive restart of accelerated gradient methods under local quadratic growth condition in IMA Journal of Numerical Analysis

Fercoq O (2016) Optimization in High Dimensions via Accelerated, Parallel, and Proximal Coordinate Descent in SIAM Review

Key Findings
Impact Summary
Further Funding
Software and Technical Products
Engagement Activities


Description	Many new algorithms for big data optimization.
Exploitation Route	The algorithms I developed underpin modern information based society. They are of fundamental nature, and applicable in virtually all domains of data science.
Sectors	Aerospace, Defence and Marine,Agriculture, Food and Drink,Chemicals,Construction,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Energy,Environment,Financial Services, and Management Consultancy,Healthcare,Leisure Activities, including Sports, Recreation and Tourism,Manufacturing, including Industrial Biotechology,Transport
URL	http://www.maths.ed.ac.uk/~prichtar/i_papers.html


Description	Several of my algorithms are now adopted by industry.
First Year Of Impact	2013
Sector	Digital/Communication/Information Technologies (including Software)
Impact Types	Societal,Economic


Description	EPSRC Early Career Fellowship
Amount	£823,211 (GBP)
Funding ID	EP/N005538/1
Organisation	Engineering and Physical Sciences Research Council (EPSRC)
Sector	Public
Country	United Kingdom
Start	01/2016
End	12/2020


Title	CoCoA
Description	A framework for communication-efficient distributed optimization for machine learning.
Type Of Technology	Software
Year Produced	2015
Impact	The CoCoA [NIPS 2014] / CoCoA+ [ICML 2015] distributed optimization algorithm developed in a duo of papers with two co-authors from Edinburgh (Martin Takác, myself) has won the MLconf Industry Impact Student Research Award. The award goes to our coauthor Virginia Smith (UC Berkeley). Other co-authors: M. Jaggi (ETH Zurich), M.I. Jordan (Berkeley), C. Ma (Lehigh), J. Terhorst (UC Berkeley), S. Krishnan (UC Berkeley), T. Hofmann (ETH Zurich). About the award: "This year, we started a new award program called the MLconf Industry Impact Student Research Award, which is sponsored by Google. This fall, our committee of distinguished ML professionals reviewed several nominations sent in from members of the MLconf community. There were several great researchers that were nominated and the committee arrived at awarding 2 students whose work, they believe, has the potential to disrupt the industry in the future. The two winners that were announced at MLconf SF 2015 are UC Irvine Student, Furong Huang and UC Berkeley Student, Virginia Smith. Below are summaries of their research. We've invited both researchers to present their work at upcoming MLconf events." The citation: " Virginia Smith's research focuses on distributed optimization for large-scale machine learning. The main challenge in many large-scale machine learning tasks is to solve an optimization objective involving data that is distributed across multiple machines. In this setting, optimization methods that work well on single machines must be re-designed to leverage parallel computation while reducing communication costs. This requires developing new distributed optimization methods with both competitive practical performance and strong theoretical convergence guarantees. Virginia's work aims to determine policies for distributed computation that meet these requirements, in particular through the development of a novel primal-dual framework, CoCoA, which is written on Spark. The theoretical and practical development of CoCoA is an important step for future data scientists hoping to deploy efficient large-scale machine learning algorithms."
URL	https://github.com/gingsmith/cocoa


Description	Talks
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	talks at workshops, seminars, conferences and so on; more than 15 a year
Year(s) Of Engagement Activity	2015,2016,2017


Description	youtube talks
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Research talks posted on youtube. Sparked interest in the developed methods in various audiences.
Year(s) Of Engagement Activity	2013,2014,2015
URL	https://www.youtube.com/watch?v=0sHOfqhCZw0

Abstract

Planned Impact

Organisations

People

ORCID iD

Publications