A Non-Asymptotic Analysis of Stochastic Mirror Descent for Non-Convex Learning

Lead Research Organisation: University of Oxford

Department Name: Statistics

Abstract

Many state of the art machine learning techniques depend heavily on the optimisation of some non-convex objective. Often, this procedure is computationally expensive and requires techniques that employ stochastic approximations or regularisation using artificial noise. For instance, there are a variety of techniques in deep learning that use artificial noise applied to the data, parameters or update direction, all of which have been shown to encourage faster convergence and improved generalisation. Though many of these techniques have demonstrated their validity through repeated experimentation and testing, the theoretical framework to validate and understand them is still in its infancy.

Another technique which is used to overcome the aforementioned computational difficulties is the mirror descent framework, which does so by taking advantage of the known geometric properties specific to the problem in consideration. It is highly popular in convex optimisation as in this setting there are many provable advantages to this algorithm. Its stochastic counterpart, stochastic mirror descent, is rapidly gaining popularity in non-convex learning. But again, the theoretical framework in this setting is under-developed.

Through our project, we seek to develop a better understanding of the statistical and computational properties of these techniques in the large-scale learning setting. Specifically, we are interested in quantitively comparing the consistency and generalisation performance of these techniques and seeing how they change as the problem grows more challenging.

Guiding much of our methodology will be the focus on non-asymptotic analysis, that is compared to analyses in which quantities are taken to their infinite limits. We believe this is essential for applications in large-scale learning where only a limited number of iterations can take place and the effect of the number of parameters on performance is of key interest.

Fundamentally, our approach will be based on approximating the iterative optimisation procedure with stochastic processes that evolve continuously in time, specifically diffusion processes. With this, we gain access to the broad suite of powerful techniques for analysing diffusions. This approach has seen a growing popularity in recent years and has already yielded interesting results in the learning setting. The project will also draw heavily from the study of stochastic differential equations as well as high dimensional statistics.

This project falls within the EPSRC Statistics and Applied probability area.

Student:

Tyler Farghly

Period of Study:

Sep 20 - Mar 25

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2444063

Research Topic:

Unclassified

Organisations

University of Oxford (Lead Research Organisation)

People	ORCID iD
Tyler Farghly (Student)

Publications

Author Name

Title Publication Date Published

10 25 50

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/R513295/1			30/09/2018	29/09/2023
2444063	Studentship	EP/R513295/1	30/09/2020	30/03/2025	Tyler Farghly
EP/T517811/1			30/09/2020	29/09/2025
2444063	Studentship	EP/T517811/1	30/09/2020	30/03/2025	Tyler Farghly

Abstract

Organisations

People

ORCID iD

Publications

Studentship Projects