Asymptotic analysis of machine learning models and algorithms

Lead Research Organisation: University of Oxford

Abstract

Machine learning has made a substantial impact on many applications across a wide variety of fields. Despite their success, many machine learning and deep learning algorithms remain ad hoc and therefore their mathematical analysis is an interesting and important topic for research. In his PhD thesis, Ziheng Wang is developing rigorous mathematical theory for several machine learning models and algorithms.
His work falls within the EPSRC research area of Artificial Intelligence Technologies and the ESPRC theme of Mathematical Sciences.

In Year 1, Ziheng proved a global convergence result for actor-critic algorithms in reinforcement learning. Actor-critic algorithms are widely used in reinforcement learning, but are challenging to mathematically analyse due to the online arrival of non-i.i.d. data samples. The distribution of the data samples dynamically changes as the model is updated, introducing a complex feedback loop between the data distribution and the reinforcement learning algorithm. We prove that, under a time rescaling, the online actor-critic algorithm with tabular parametrization converges to an ordinary differential equations (ODEs) as the number of updates becomes large. The proof first establishes the geometric ergodicity of the data samples under a fixed actor policy. Then, using a Poisson equation, we prove that the fluctuations of the data samples around a dynamic probability measure, which is a function of the evolving actor model, vanish as the number of updates become large. Once the ODE limit has been derived, we study its convergence properties using a two time-scale analysis which asymptotically de-couples the critic ODE from the actor ODE.

The convergence of the critic to the solution of the Bellman equation and the actor to the optimal policy are proven. In addition, a convergence rate to this global minimum is also established. Our convergence analysis holds under specific choices for the learning rates and exploration rates in the actor-critic algorithm, which could provide guidance for the implementation of actor-critic algorithms in practice. Building upon this Year 1 research, Ziheng is now working on using similar mathematical techniques to prove global convergence for actor-critic algorithms with single-layer neural network function approximators.

In addition, Ziheng is working with Professor Alvaro Cartea on a project in the area of high-frequency trading. The goal is to find a good approach for combining the prediction for the midprice returns of a given asset T1, T2,... TN seconds away to design an efficient strategy to improve the broker's performance in market making and optimal execution. As a first step, we want to improve the performance of the broker by incorporating the signal into optimal execution model. To do this, we set up a new model and solve a game between a broker, a noise trader and an informed trader. We solve the stochastic control problem for the informed trader and broker and get the optimal strategy in closed-form.

Planned Impact

Probabilistic modelling permeates the Financial services, healthcare, technology and other Service industries crucial to the UK's continuing social and economic prosperity, which are major users of stochastic algorithms for data analysis, simulation, systems design and optimisation. There is a major and growing skills shortage of experts in this area, and the success of the UK in addressing this shortage in cross-disciplinary research and industry expertise in computing, analytics and finance will directly impact the international competitiveness of UK companies and the quality of services delivered by government institutions.
By training highly skilled experts equipped to build, analyse and deploy probabilistic models, the CDT in Mathematics of Random Systems will contribute to
- sharpening the UK's research lead in this area and
- meeting the needs of industry across the technology, finance, government and healthcare sectors

MATHEMATICS, THEORETICAL PHYSICS and MATHEMATICAL BIOLOGY

The explosion of novel research areas in stochastic analysis requires the training of young researchers capable of facing the new scientific challenges and maintaining the UK's lead in this area. The partners are at the forefront of many recent developments and ideally positioned to successfully train the next generation of UK scientists for tackling these exciting challenges.
The theory of regularity structures, pioneered by Hairer (Imperial), has generated a ground-breaking approach to singular stochastic partial differential equations (SPDEs) and opened the way to solve longstanding problems in physics of random interface growth and quantum field theory, spearheaded by Hairer's group at Imperial. The theory of rough paths, initiated by TJ Lyons (Oxford), is undergoing a renewal spurred by applications in Data Science and systems control, led by the Oxford group in conjunction with Cass (Imperial). Pathwise methods and infinite dimensional methods in stochastic analysis with applications to robust modelling in finance and control have been developed by both groups.
Applications of probabilistic modelling in population genetics, mathematical ecology and precision healthcare, are active areas in which our groups have recognized expertise.

FINANCIAL SERVICES and GOVERNMENT

The large-scale computerisation of financial markets and retail finance and the advent of massive financial data sets are radically changing the landscape of financial services, requiring new profiles of experts with strong analytical and computing skills as well as familiarity with Big Data analysis and data-driven modelling, not matched by current MSc and PhD programs. Financial regulators (Bank of England, FCA, ECB) are investing in analytics and modelling to face this challenge. We will develop a novel training and research agenda adapted to these needs by leveraging the considerable expertise of our teams in quantitative modelling in finance and our extensive experience in partnerships with the financial institutions and regulators.

DATA SCIENCE:

Probabilistic algorithms, such as Stochastic gradient descent and Monte Carlo Tree Search, underlie the impressive achievements of Deep Learning methods. Stochastic control provides the theoretical framework for understanding and designing Reinforcement Learning algorithms. Deeper understanding of these algorithms can pave the way to designing improved algorithms with higher predictability and 'explainable' results, crucial for applications.
We will train experts who can blend a deeper understanding of algorithms with knowledge of the application at hand to go beyond pure data analysis and develop data-driven models and decision aid tools
There is a high demand for such expertise in technology, healthcare and finance sectors and great enthusiasm from our industry partners. Knowledge transfer will be enhanced through internships, co-funded studentships and paths to entrepreneurs

Student:

Ziheng Wang

Period of Study:

Sep 20 - Sep 24

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2443932

Research Topic:

Unclassified

Organisations

People	ORCID iD
Ziheng Wang (Student)

Publications

Author Name

Title Publication Date Published

10 25 50

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/S023925/1			31/03/2019	29/09/2027
2443932	Studentship	EP/S023925/1	30/09/2020	29/09/2024	Ziheng Wang