Path-to-signature isometries with applications to modelling the long-term dynamics of complex systems

Lead Research Organisation: University of Warwick
Department Name: Statistics

Abstract

Almost all natural and man-made processes behave differently at different time scales. For example, if we plot the temperature in Coventry on a minute-by-minute scale over an hour, we would expect to see small, smooth changes with no clear trend. On the other hand, we would expect the weekly temperature over a year to exhibit large fluctuations and a seasonal trend. To capture the behaviour of temperature changes on all scales, we would need to use complex, high-dimensional dynamical systems. However, such systems can be extremely inefficient, which is why coarse-grained models, describing the long-term dynamics, are often used instead.

Our aim is to address the problem of fitting coarse-grained models to data. The main challenge is that coarse-grained models, while successful in providing a good approximation of the long-term dynamics, often fail to capture the fine-scale properties of the system. Typically, coarse-grained models exhibit rougher behaviour (e.g. equivalent to Brownian motion) than the full complex systems, which at the very fine scale are usually of bounded variation. As a result, direct use of standard estimators can lead to wrong results unless the mismatch between model and data is carefully addressed. The main limitation of current methodology is that it depends on explicit knowledge of the scale separation parameter, which allows us to use data at a scale compatible with the coarse-grained model. However, this information is usually not available.
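The mismatch described above can be illustrated with a small numerical sketch. It is a toy example, not the project's data or method. It builds a path that is Brownian on a coarse grid and linearly interpolated (hence of bounded variation) on a fine grid, then applies the standard sum-of-squared-increments estimator of the squared diffusion coefficient at both scales. The names `sigma`, `n_coarse`, `k` and `realised_variance` are hypothetical choices made for this illustration.

```python
import numpy as np


def realised_variance(path, step):
    """Sum of squared increments of `path`, subsampled every `step` points.

    On the time interval [0, 1] this is the standard estimator of sigma**2
    for a path of the form sigma * (Brownian motion).
    """
    increments = np.diff(path[::step])
    return float(np.sum(increments ** 2))


rng = np.random.default_rng(0)
sigma = 2.0       # hypothetical diffusion coefficient of the coarse-grained model
n_coarse = 1000   # number of coarse intervals on [0, 1]
k = 100           # fine sampling points per coarse interval (scale separation)

# Coarse scale: a discretised Brownian path with diffusion coefficient sigma.
steps = sigma * np.sqrt(1.0 / n_coarse) * rng.standard_normal(n_coarse)
coarse = np.concatenate([[0.0], np.cumsum(steps)])

# Fine scale: linear interpolation, so the observed path has bounded variation.
t_coarse = np.linspace(0.0, 1.0, n_coarse + 1)
t_fine = np.linspace(0.0, 1.0, n_coarse * k + 1)
fine = np.interp(t_fine, t_coarse, coarse)

rv_coarse = realised_variance(fine, k)  # data used at the Brownian scale
rv_fine = realised_variance(fine, 1)    # data used at the finest scale

print(f"sigma^2 = {sigma**2}, coarse-scale estimate = {rv_coarse:.3f}, "
      f"fine-scale estimate = {rv_fine:.5f}")
```

In this toy setting the fine-scale estimate is smaller than the coarse-scale one by exactly the factor `k`, so the estimator is only usable if the subsampling step, i.e. the scale separation, is known in advance.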

We will construct a new estimator based on a rapidly developing tool known as the rough path signature, which is a purpose-built tool for stochastic models with multiscale behaviour. The rough path signature is a sequence in which the first term describes the behaviour of the model at a smooth scale, while the second term sees the finer Brownian scale, and so on. The limiting asymptotics of the signature capture the behaviour of the model at all scales, and it is possible to extract the behaviour at a single scale by appropriate normalisation. Our goal will be to identify the normalisation that leads to the extraction of the Brownian scale, thus providing an estimator for the diffusion coefficient, by making implicit use of the scale separation exhibited by the data.
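To make the first two terms of the signature concrete, the following is a minimal sketch for a piecewise-linear path, using Chen's identity to append one straight segment at a time. It only computes the terms themselves and is not the estimator proposed in the project. The function name `signature_level_1_2` and the example path are hypothetical.

```python
import numpy as np


def signature_level_1_2(points):
    """Levels 1 and 2 of the signature of the piecewise-linear path
    through `points` (array of shape (n_points, d)).

    Level 1 is the total increment (a vector of length d).
    Level 2 is the d x d matrix of iterated integrals.
    """
    points = np.asarray(points, dtype=float)
    d = points.shape[1]
    s1 = np.zeros(d)
    s2 = np.zeros((d, d))
    for dx in np.diff(points, axis=0):
        # Chen's identity: a straight segment with increment dx has
        # level-1 term dx and level-2 term outer(dx, dx) / 2.
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    return s1, s2


# Example: the unit square traversed anticlockwise, starting and ending at 0.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]])
s1, s2 = signature_level_1_2(square)
levy_area = 0.5 * (s2[0, 1] - s2[1, 0])  # antisymmetric part of level 2

# Scaling the path by a factor c scales level n of the signature by c**n.
s1_scaled, s2_scaled = signature_level_1_2(3.0 * square)
```

For this closed path the first term vanishes while the antisymmetric part of the second term equals the enclosed area, so the two terms carry different information. The scaling check shows that each term responds with a different power of the scaling factor, which is why the choice of normalisation determines which term dominates.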

The key theoretical underpinning of this estimator is a formula recently discovered by the co-I and his collaborator for extracting the behaviour of a path at the smooth and Brownian scales from the signature. The second objective of the project is to extend these results to the bounded variation scale. A fundamental difficulty has been how to move beyond the assumption of a continuous derivative. However, the co-I and his collaborator have recently managed to achieve this in a class of two-dimensional models. We will build on this discovery to establish a general formula for extracting the bounded variation behaviour from the signature.

The proposed research is a first step towards a much larger research programme. One of the main advantages of our approach is that signature-based estimators should naturally generalise to all scales and, consequently, more general models. In order to fully develop the signature as a standard tool in multiscale modelling, we must extend this "scale-extraction" result to all scales. This will require a systematic methodology for the identification of the appropriate normalisation constant, both in the context of exact models and coarse-grained models.

Description From temperature to particle motion and stock prices, understanding the evolution of a random quantity is of great importance in many aspects of life, and a fundamental part of that is quantifying uncertainty, i.e. the characteristics of fluctuations around an expected level. In classical models, the source of randomness has often been assumed to be of Brownian type and, under this assumption, there exist many methods for estimating the 'diffusion coefficient' or 'volatility', which is one measure of uncertainty. However, these methods often fail when applied directly to real data, as the Brownian assumption is usually a good approximation only at a certain scale. We have developed a new method for estimating the diffusion coefficient in the context of physical Brownian motion (a process that behaves like Brownian motion at a certain scale but not at the fine scale). The method is based on identifying those elements of the appropriately normalised 'rough path signature' - an alternative way of describing a path - that correspond to the Brownian scale. Experimental results demonstrate that the method is able to extract the relevant information under suitable assumptions, thus providing a robust estimator of volatility. The ideas behind the method can be generalised to more complex systems, paving the way for a general method for extracting information from continuous data streams at different scales, so that modelling assumptions are satisfied. At the same time, it opens up many theoretical questions on the asymptotic behaviour of the 'rough path signature' for different normalisations and scalings.

Another measure of uncertainty is the frequency of large movements. While the existing literature provides an upper limit for it for processes driven by Brownian randomness, little is known about the lower limit, or indeed whether the upper limit is attained in any way. As part of this project, we have constructed examples of models for which we can rigorously demonstrate that the upper limit is in some way sharp. This holds under the more general assumption of fractional Brownian motion (fBM) as the source of randomness, which generalises the Brownian motion assumption and has, in many cases, been shown to be more compatible with the behaviour of data.

The use of the 'rough path signature' as an alternative way of efficiently capturing information from a data stream has already had significant impact in many applications. As a final part of this project, we have started a collaboration with biologists at DKFZ, Heidelberg, aiming to develop a robust method for identifying modifications in RNA molecules from the fluctuations they cause in an electric current as they move through a nanopore. Such modifications have been linked to several biological processes and could be of significance in the prognosis, diagnosis or treatment of a number of diseases.
Exploitation Route The need for precise modelling of data streams exhibiting multiscale behaviour comes up in a number of different sectors, such as the financial sector (e.g. modelling of financial data), the environment (e.g. modelling of environmental data, from climate data to animal movement), or biotechnology (e.g. modelling of the nanopore data). The PI is already working on a different project aiming to answer environmental questions by modelling and detecting changes in animal movement data. As already mentioned, the PI has also started a collaboration with a team of biologists from DKFZ, Heidelberg on modelling RNA nanopore data, aiming to detect modifications in RNA molecules.
Sectors Environment; Financial Services, and Management Consultancy; Pharmaceuticals and Medical Biotechnology

 
Description Conference Presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Around 30 international participants attended.
Year(s) Of Engagement Activity 2022
 
Description Open day talk 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact About 20 Sixth Form students and their parents attended an Open Day for prospective undergraduate students, where some high-level ideas were presented with the aim of increasing interest in the area and attracting more students.
Year(s) Of Engagement Activity 2023
URL https://warwick.ac.uk/fac/sci/statistics/courses/offerholders2023/
 
Description Presentation at Applied Probability Seminar at University of Warwick 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact 20 postgraduate students and academics attended the seminar. There were questions from the audience, and the talk helped to increase awareness of rough path theory as a method in Probability and Statistics.
Year(s) Of Engagement Activity 2022
 
Description Workshop - Banff International Research Station 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact About 15 postgraduate students, academic colleagues and practitioners in the financial sector attended the workshop, giving us the opportunity to disseminate results.
Year(s) Of Engagement Activity 2022
URL http://www.birs.ca/events/2022/5-day-workshops/22w5116/schedule