DMS-EPSRC: Fast martingales, large deviations and randomised gradients for heavy-tailed target distributions
Lead Research Organisation:
University of Warwick
Department Name: Statistics
Abstract
A Markov chain is a mathematical object representing a random evolution with the following property: if we know the present state of the chain, its past and future are independent (i.e. information about the past does not alter the distribution of its future states). Markov chain models are fundamental across the sciences and engineering. At the centre of this project are Markov chains on multi-dimensional state spaces that arise in randomised algorithms used in statistics and machine learning. This proposal focuses on the theoretical analysis of chains arising in applications whose limiting distribution has heavy tails. The analysis of heavy-tailed phenomena is crucial for the future success of randomised algorithms for two reasons: (a) they arise naturally in many applied problems and (b) they are the least well understood, as they violate standard assumptions made in the existing theory (e.g. asymptotic linearity of the potential of the limit distribution at infinity).
(a) Heavy-tailed limiting distributions arise naturally in many applications. For example, if the errors in a regression model are distributed according to a Cauchy distribution, the posterior density has polynomial tails. Perhaps more startling is that heavy tails can arise in the posterior even when no heavy-tailed distribution appears in the definition of the model. If the errors in a data set are heteroscedastic, meaning that the variance of the error term varies with each observation, it is necessary to use so-called robust regression (based on e.g. Lasso-type penalisation) in order to reduce the effect of outliers. Again the posterior has heavy tails.
(b) The presence of a spectral gap is known to be equivalent to geometric convergence of a Markov chain. However, as pointed out recently in the queueing literature, under geometric convergence ergodic estimators may still exhibit large deviation behaviour of the heavy-tailed type. Conversely, Markov chains with heavy-tailed stationary measures typically do not have a spectral gap but might nevertheless exhibit good convergence properties. The EPSRC-NSF Lead Agency agreement presents a unique opportunity to combine the US expertise in theoretical Operations Research with the UK's capability in Computational Statistics, resulting in novel methodology for the analysis of the convergence of Markov chains with heavy-tailed targets, the main focus of this project.
Our main goal is to fill a gap in the literature, best illustrated by the following baseline algorithm from applications: a random-scan Metropolis-within-Gibbs chain picks a coordinate of the target distribution at random and moves it by a one-dimensional Metropolis step based on the corresponding conditional of the target. It is possible to prove that if ANY one-dimensional marginal of the target has heavy tails, the random-scan chain is NOT geometrically ergodic. This proposal therefore aims to lay the theoretical foundations for the analysis of the stability of Markov chains with heavy-tailed targets, focusing on the processes that underpin many randomised algorithms used in practice. In time, this work is expected to have impact far beyond applied probability, in a number of sub-areas of computational statistics and machine learning where heavy-tailed targets arise.
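The random-scan Metropolis-within-Gibbs chain described above can be sketched in a few lines of Python. This is a minimal illustration and not code from the project: the multivariate Student-t target (`log_target`, with hypothetical parameter `nu`), the Gaussian random-walk proposal and the `step` size are all assumptions chosen only to give a concrete heavy-tailed example.

```python
import math
import random


def log_target(x, nu=1.0):
    """Unnormalised log-density of a d-dimensional Student-t target.

    Assumed example target: its tails decay polynomially, so every
    one-dimensional marginal is heavy-tailed (Cauchy when nu = 1).
    """
    d = len(x)
    return -0.5 * (nu + d) * math.log1p(sum(v * v for v in x) / nu)


def random_scan_mwg(log_density, x0, n_steps, step=1.0, seed=0):
    """Random-scan Metropolis-within-Gibbs with a Gaussian random-walk proposal."""
    rng = random.Random(seed)
    x = list(x0)
    current = log_density(x)
    chain = []
    accepted = 0
    for _ in range(n_steps):
        # Pick one coordinate uniformly at random.
        i = rng.randrange(len(x))
        old = x[i]
        # Symmetric one-dimensional proposal for that coordinate only.
        x[i] = old + step * rng.gauss(0.0, 1.0)
        proposed = log_density(x)
        # With the other coordinates fixed, the ratio of joint densities
        # equals the ratio of the one-dimensional conditionals.
        if math.log(1.0 - rng.random()) < proposed - current:
            current = proposed
            accepted += 1
        else:
            x[i] = old  # reject: restore the previous value
        chain.append(tuple(x))
    return chain, accepted / n_steps


if __name__ == "__main__":
    chain, rate = random_scan_mwg(log_target, [0.0] * 3, n_steps=5000)
    print(f"acceptance rate: {rate:.2f}")
    print(f"largest |coordinate| visited: {max(abs(v) for s in chain for v in s):.1f}")
```

Each iteration changes at most one coordinate, which is the feature that makes the chain sensitive to heavy tails in any single marginal.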
Publications
Yang Jun
(2022)
Stereographic Markov Chain Monte Carlo
in arXiv e-prints
Wilfrid S. Kendall
(2022)
Optimal Markovian coupling for finite activity Lévy processes
in Bernoulli
Vasdekis Giorgos
(2021)
Speed Up Zig-Zag
in arXiv e-prints
Ramírez M
(2024)
The sticky Lévy process as a solution to a time change equation
in Journal of Mathematical Analysis and Applications
Mijatovic A
(2024)
Stationary entrance chains and applications to random walks
Mijatovic A
(2022)
Limit theorems for local times and applications to SDEs with jumps
in Stochastic Processes and their Applications
Menshikov M
(2023)
Reflecting Brownian motion in generalized parabolic domains: Explosion and superdiffusivity
in Annales de l'Institut Henri Poincaré, Probabilités et Statistiques
Jorge González Cázares
(2023)
Joint density of a stable process and its supremum: regularity and upper bounds
in Bernoulli
González Cázares J
(2022)
Convex minorants and the fluctuation theory of Lévy processes
in Latin American Journal of Probability and Mathematical Statistics
Description | New research directions have been identified and significant progress made.
1. Lévy processes (LPs) are arguably the most fundamental class of stochastic processes with heavy tails, as they are continuous-time analogues of random walks. LPs were introduced in the first half of the twentieth century. They have been extensively studied since and have played a major role in many application areas in the sciences and beyond. In this project we made the surprising discovery of a new representation of the law of a Lévy process, based on a probabilistic structure called the "stick-breaking construction". This new way of looking at LPs greatly simplified the classical theory, yielding much more directly deep results that previously required 150 pages of a monograph to establish. It also enabled us to prove new results in this classical field. Our main motivation was the simulation of certain path statistics of Lévy processes with heavy tails that arise in applications, and it was precisely this angle, which is the focus of this grant, that made these advances possible.
2. High-dimensional heavy-tailed invariant measures and rates of convergence. The convergence of Markov processes with high-dimensional stationary distributions, especially those with heavy tails, is known to be slow in practice and notoriously difficult to analyse. We developed a criterion for establishing lower bounds on the convergence rate in such cases. The key feature of this advance is that it is applicable in practice (in the same way as the classical theory for upper bounds on convergence is). It is thus expected to be useful in the design of simulation algorithms, as it will for the first time allow a direct comparison of convergence rates, which is not feasible if only upper bounds are available.
A further highlight of the project is a new algorithmic approach for MCMC samplers that maps the original high-dimensional problem in Euclidean space onto a sphere, thus remedying the notorious mixing problems for heavy-tailed target distributions. The proposed samplers may enjoy the "blessings of dimensionality", where convergence is faster in higher dimensions. |
Exploitation Route | The work in this project, while theoretical in nature, can be taken forward to advance the simulation methods for Lévy processes used in practice, and to help provide novel approaches to simulation problems within Machine Learning that require sampling from heavy-tailed target distributions. We are still at a relatively early stage of the theoretical developments and expect to report on the implications of this research in more detail in the years to come. |
Sectors | Digital/Communication/Information Technologies (including Software); Energy; Financial Services and Management Consultancy; Healthcare; Security and Diplomacy |
URL | https://www.youtube.com/@prob-am7844 |
Description | HEAVY TAILS IN MACHINE LEARNING - 1 month INI Satellite Programme on applied probability and machine learning, to take place at The Alan Turing Institute |
Amount | £75,000 (GBP) |
Organisation | Isaac Newton Institute for Mathematical Sciences |
Sector | Academic/University |
Country | United Kingdom |
Start | 05/2024 |
End | 07/2024 |
Description | Prob-AM, a YouTube channel for dissemination of research outputs |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Short YouTube videos aimed at researchers and students (graduate and undergraduate). The main purpose is to extend awareness of my research by lowering the time/effort barrier for understanding and using the research output. |
Year(s) Of Engagement Activity | 2020, 2021, 2022, 2023, 2024 |
URL | https://www.youtube.com/channel/UCXSoLS_uKebYZ9GzgAF0ZsA |