On robustness and adaptivity properties of mirror descent in high-dimension

Lead Research Organisation: University of Oxford

Abstract

The problem of minimising an objective function arises naturally in the vast majority of statistical learning problems. There is a wealth of optimisation methods for this task, many of them with strong theoretical and empirical guarantees. One of the simplest and best-known is gradient descent, along with its stochastic variants and its generalisation, mirror descent. However, the scenarios commonly studied in theory assume uncontaminated training data. Given the enormous amounts of data that learning algorithms now have to process, one would expect non-negligible contamination that affects the quality of the predictors. Learning in the presence of outliers can be done, for example, by applying a filtering method before the learning phase. This often turns out to be computationally intensive and rather difficult to study, as different types of outliers require different approaches. If the optimisation algorithm itself had the same effect as this pre-processing step, one might expect significant computational savings and a unified approach to dealing with such problems.

The main goal of this project is to understand whether mirror descent is robust to 'outliers', and whether it adapts to different types of contamination in the data. The questions this project tries to answer are the following:
- Is there a unified approach to studying statistical learning problems that involve different types of data contamination? What amount of contamination can provably cause any learning algorithm to have poor predictive guarantees?
- How does the choice of the mirror map (and hence of the induced geometry) affect the robustness properties of the mirror descent algorithm? Is there a choice of the mirror map that guarantees the algorithm is adaptive to different types of data contamination?
- To what extent can these potential findings be applied in order to explain the robustness properties observed empirically in different models, e.g. deep neural networks?
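
To make the notion of a mirror map concrete, the following is a minimal sketch and not part of the project itself. It assumes a toy quadratic objective on the probability simplex, and all function names, the target vector, and the step size are hypothetical choices for illustration. It shows that the same update rule yields plain gradient descent under the Euclidean mirror map and exponentiated gradient under the negative-entropy mirror map. It does not model data contamination.

```python
import math

def mirror_descent(grad, x0, mirror, inverse_mirror, step, n_steps):
    """Generic mirror descent: the gradient step is taken in the dual space."""
    x = list(x0)
    for _ in range(n_steps):
        g = grad(x)
        theta = mirror(x)  # map the iterate to the dual space
        theta = [t - step * gi for t, gi in zip(theta, g)]
        x = inverse_mirror(theta)  # map back to the primal space
    return x

# Euclidean mirror map phi(x) = 0.5 * ||x||^2: both maps are the identity,
# so the update reduces to plain gradient descent.
def euclid_mirror(x):
    return list(x)

def euclid_inverse(theta):
    return list(theta)

# Negative-entropy mirror map phi(x) = sum_i x_i log x_i on the simplex.
# Its gradient is 1 + log x_i; the constant is dropped because the
# normalisation in the inverse map cancels it (exponentiated gradient).
def entropy_mirror(x):
    return [math.log(xi) for xi in x]

def entropy_inverse(theta):
    m = max(theta)  # subtract the maximum for numerical stability
    w = [math.exp(t - m) for t in theta]
    s = sum(w)
    return [wi / s for wi in w]

# Toy objective (an assumption): f(x) = 0.5 * ||x - target||^2,
# whose minimiser lies inside the simplex.
target = [0.7, 0.2, 0.1]

def grad(x):
    return [xi - ti for xi, ti in zip(x, target)]

x0 = [1.0 / 3.0] * 3
x_euclid = mirror_descent(grad, x0, euclid_mirror, euclid_inverse, 0.5, 500)
x_entropy = mirror_descent(grad, x0, entropy_mirror, entropy_inverse, 0.5, 500)
```

On this toy problem both choices approach the same minimiser. They differ in the path the iterates take: the entropic iterates stay strictly positive and sum to one by construction, which is one sense in which the mirror map induces a geometry.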

The work on statistical robustness, started in the previous century by Huber, mostly focuses on robust estimation, rather than robust prediction. Techniques used in this area of research offer a potential starting point to approach the above-mentioned questions, along with literature on high-dimensional statistics, mirror descent, and online learning (for example, as in the 'best of both worlds' problem). Our hope is that by combining different methods that are popular in the aforementioned areas, the final result of this project will be a novel theoretical understanding of robust learning, analogous to the classical learning theory framework.

This project falls within the EPSRC Statistics and applied probability and EPSRC Theoretical computer science research areas.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is also a range of further impacts:
- The CDT has a large number of high calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. This research will be relevant, addressing the key questions behind real-world problems. It will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximise reproducibility and replicability, source code and replication files will be made available as open source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S023151/1 | | | 01/04/2019 | 30/09/2027 |
2564812 | Studentship | EP/S023151/1 | 01/10/2021 | 30/09/2025 | Alex Buna Marginean