Time Series Forecasting with Graphical Structure

Lead Research Organisation: University of Oxford

Abstract

Multivariate time series forecasting models are ubiquitous throughout the sciences, particularly in engineering and finance. The most commonly used models treat the data as unstructured and rely entirely on learned parameters to capture interdependencies. In many real-world scenarios, however, the underlying system has additional inherent structure which can be represented graphically and exploited to yield more accurate and interpretable predictions. Examples include disease propagation in epidemiology, traffic flow modelling and covariance modelling in finance.

Traditional forecasting methods often struggle with these problems: they either involve a very large number of parameters relative to the available data (as in classical autoregressive models) or are extremely computationally expensive to deploy (as in many deep learning solutions). By exploiting the underlying structure of the data, however, it is often possible to avoid overfitting and reduce computational complexity at the same time. Developing and deploying forecasting schemes which take advantage of this is a fledgling and promising direction of research.



Empirical similarity models build forecasts as a direct function of the previously observed data points most closely related to the test input. They consist of two fundamental components: a model of similarity and a construction function which combines similar observations into a forecast. They have several attractive properties which make them useful in practice (a minimal sketch follows this list):
- they do not require any training, and new data can be integrated on an ongoing basis at no additional cost
- they are highly modular: the components can be freely interchanged, even at forecast time
- they are fast and highly interpretable in practice
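
As a concrete illustration of these two components, the following is a minimal sketch in Python (the function names, the Gaussian kernel and the toy data are illustrative choices, not a description of our actual pipeline). It forecasts the next value of a univariate series by a similarity-weighted average of past observations, with both the similarity model and the construction function passed in as interchangeable arguments:

    import numpy as np

    def gaussian_similarity(x, X_hist, bandwidth=1.0):
        """Similarity model: Gaussian kernel of the distance between the test
        input and each historical input (stabilised on the log scale)."""
        log_s = -np.sum((X_hist - x) ** 2, axis=1) / (2.0 * bandwidth ** 2)
        return np.exp(log_s - log_s.max())

    def weighted_average(similarities, y_hist):
        """Construction function: similarity-weighted average of past observations."""
        return (similarities / similarities.sum()) @ y_hist

    def similarity_forecast(x, X_hist, y_hist, similarity=gaussian_similarity,
                            construct=weighted_average):
        """Empirical similarity forecast: both components can be swapped freely."""
        return construct(similarity(x, X_hist), y_hist)

    # Toy usage: forecast the next value of a random walk from its last three lags.
    rng = np.random.default_rng(0)
    series = np.cumsum(rng.normal(size=200))
    lags = 3
    X_hist = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
    y_hist = series[lags:]
    print(similarity_forecast(series[-lags:], X_hist, y_hist))

Because the two components are plain functions, swapping in a different kernel or a different combination rule (for example, a median of the most similar observations) requires no retraining.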

Empirical similarity methods have been deployed to great effect in financial contexts, particularly in volatility forecasting. We intend to adapt and develop these methods to better suit network-structured data, in particular for the related problem of covariance forecasting. Covariance matrices of financial assets play a central role in modern asset management: accurate forecasting is critical, for example, to managing portfolio risk and pricing options. Despite this, the body of research on this topic is relatively modest, with simple vector heterogeneous autoregressive (VHAR) models still performing competitively against much more complex and computationally expensive state-of-the-art models.
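
For context, the VHAR baseline regresses each entry of the half-vectorised realized covariance matrix on its own averages over the previous day, week and month. Below is a minimal sketch in Python, assuming the daily realized covariance matrices have already been half-vectorised into the rows of Y; the function names and the (1, 5, 22)-day horizons are conventional but illustrative choices:

    import numpy as np

    def vhar_design(Y, horizons=(1, 5, 22)):
        """Targets Y[t] and features built from averages of the previous 1, 5 and
        22 observations (the daily, weekly and monthly HAR components)."""
        h_max = max(horizons)
        X, Z = [], []
        for t in range(h_max, len(Y)):
            feats = [Y[t - h:t].mean(axis=0) for h in horizons]
            X.append(np.concatenate([[1.0], *feats]))  # intercept + stacked averages
            Z.append(Y[t])
        return np.asarray(X), np.asarray(Z)

    def fit_vhar(Y, horizons=(1, 5, 22)):
        """Equation-by-equation least-squares fit of the VHAR coefficient matrix."""
        X, Z = vhar_design(Y, horizons)
        coef, *_ = np.linalg.lstsq(X, Z, rcond=None)
        return coef

    def vhar_forecast(Y, coef, horizons=(1, 5, 22)):
        """One-step-ahead forecast from the latest daily/weekly/monthly averages."""
        feats = [Y[-h:].mean(axis=0) for h in horizons]
        return np.concatenate([[1.0], *feats]) @ coef

    # Toy usage with a synthetic panel of k = 6 half-vectorised covariance entries.
    rng = np.random.default_rng(0)
    Y = rng.gamma(2.0, 0.01, size=(500, 6))
    print(vhar_forecast(Y, fit_vhar(Y)))

The simplicity and low cost of this baseline are part of why it remains the natural reference point for more elaborate forecasting pipelines.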

We have seen that by exploiting the network structure of this problem, an empirical similarity-based pipeline can outperform VHAR models while retaining a high degree of interpretability. There remains much to be done in this space:

- the correct notion of similarity is central to the success of the pipeline, and this touches on the highly active field of network embedding techniques
- similarly, the choice of construction function can have a significant impact on accuracy, a consideration scarcely explored in the literature to date
- the non-stationary nature of the data means that the best results are often attained not through a single model but through diverse ensembles of models suited to different regimes
- the creation of dynamic ensembling schemes well-suited to time series data is therefore another promising topic of research (a simple weighting rule of this kind is sketched after this list)
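
One simple instance of such a dynamic scheme is an exponentially weighted average of forecasters, in which each candidate model's weight decays with its recent forecast error. The sketch below is illustrative only; the window length, squared-error loss and learning rate eta are assumptions made for exposition rather than the scheme we ultimately intend to use:

    import numpy as np

    def exponential_weights(recent_losses, eta=1.0):
        """Weight each candidate model by the exponential of its negative
        cumulative loss over a rolling window, so recently accurate models
        receive most of the weight."""
        scores = -eta * np.sum(recent_losses, axis=0)
        w = np.exp(scores - scores.max())  # subtract the max for numerical stability
        return w / w.sum()

    def ensemble_forecast(model_forecasts, recent_losses, eta=1.0):
        """Combine candidate forecasts with performance-adaptive weights."""
        return exponential_weights(recent_losses, eta) @ model_forecasts

    # Toy usage: three candidate models, squared-error losses over the last 20 steps.
    rng = np.random.default_rng(0)
    recent_losses = rng.gamma(2.0, 0.1, size=(20, 3))  # shape (window, n_models)
    model_forecasts = np.array([1.02, 0.97, 1.10])     # each model's forecast at time t
    print(ensemble_forecast(model_forecasts, recent_losses))

Because the weights are recomputed at every step from a rolling window, the ensemble can shift towards whichever regime-specific model is currently performing best.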

We intend to continue to expand on these observations with the aim of producing fast, interpretable forecasts suitable for deployment in practice. Thanks to their highly flexible, modular structure, we hope that the resulting family of pipelines will be applicable not only to covariance forecasting but more broadly to any forecasting application with graphical structure.

This project falls within the following EPSRC research areas: artificial intelligence technologies, statistics and applied probability.

Planned Impact

Probabilistic modelling permeates the financial services, healthcare, technology and other service industries crucial to the UK's continuing social and economic prosperity; these sectors are major users of stochastic algorithms for data analysis, simulation, systems design and optimisation. There is a major and growing shortage of experts in this area, and the UK's success in building cross-disciplinary research and industry expertise in computing, analytics and finance will directly impact the international competitiveness of UK companies and the quality of services delivered by government institutions.
By training highly skilled experts equipped to build, analyse and deploy probabilistic models, the CDT in Mathematics of Random Systems will contribute to
- sharpening the UK's research lead in this area and
- meeting the needs of industry across the technology, finance, government and healthcare sectors

MATHEMATICS, THEORETICAL PHYSICS and MATHEMATICAL BIOLOGY

The explosion of novel research areas in stochastic analysis requires the training of young researchers capable of facing the new scientific challenges and maintaining the UK's lead in this area. The partners are at the forefront of many recent developments and ideally positioned to successfully train the next generation of UK scientists for tackling these exciting challenges.
The theory of regularity structures, pioneered by Hairer (Imperial), has generated a ground-breaking approach to singular stochastic partial differential equations (SPDEs) and opened the way to solving longstanding problems in the physics of random interface growth and quantum field theory, spearheaded by Hairer's group at Imperial. The theory of rough paths, initiated by TJ Lyons (Oxford), is undergoing a renewal spurred by applications in Data Science and systems control, led by the Oxford group in conjunction with Cass (Imperial). Pathwise and infinite-dimensional methods in stochastic analysis, with applications to robust modelling in finance and control, have been developed by both groups.
Applications of probabilistic modelling in population genetics, mathematical ecology and precision healthcare are active areas in which our groups have recognised expertise.

FINANCIAL SERVICES and GOVERNMENT

The large-scale computerisation of financial markets and retail finance and the advent of massive financial data sets are radically changing the landscape of financial services, requiring new profiles of experts with strong analytical and computing skills as well as familiarity with Big Data analysis and data-driven modelling, a combination not matched by current MSc and PhD programmes. Financial regulators (Bank of England, FCA, ECB) are investing in analytics and modelling to face this challenge. We will develop a novel training and research agenda adapted to these needs by leveraging the considerable expertise of our teams in quantitative modelling in finance and our extensive experience in partnerships with financial institutions and regulators.

DATA SCIENCE

Probabilistic algorithms, such as stochastic gradient descent and Monte Carlo Tree Search, underlie the impressive achievements of Deep Learning methods. Stochastic control provides the theoretical framework for understanding and designing Reinforcement Learning algorithms. A deeper understanding of these algorithms can pave the way to designing improved algorithms with higher predictability and 'explainable' results, crucial for applications.
We will train experts who can blend a deeper understanding of algorithms with knowledge of the application at hand to go beyond pure data analysis and develop data-driven models and decision aid tools.
There is high demand for such expertise in the technology, healthcare and finance sectors and great enthusiasm from our industry partners. Knowledge transfer will be enhanced through internships, co-funded studentships and pathways to entrepreneurship.

Publications


Studentship Projects

Project Reference: EP/S023925/1, Start: 01/04/2019, End: 30/09/2027
Project Reference: 2594661, Relationship: Studentship, Related To: EP/S023925/1, Start: 01/10/2021, End: 30/09/2025, Student Name: Mark Jennings