Data-driven methods for timeseries modelling across asset classes

Lead Research Organisation: University of Oxford

Abstract

My current research sits at the intersection of engineering and the mathematical sciences, aiming to improve the modelling of cross-asset market microstructure in equity and exchange-traded fund (ETF) markets. This multidisciplinary approach leverages techniques from modern statistics and machine learning to obtain insights from multi-terabyte-scale, low signal-to-noise timeseries datasets. Trade co-occurrence captures the information content of trades occurring in close temporal proximity to each other. By performing the first systematic analysis of trade co-occurrence between equities and ETFs, we aim to address fundamental questions in market microstructure, such as identifying arbitrage flow and detecting lead-lag relationships.
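As a purely illustrative sketch, and not the project's actual methodology, one simple way to make co-occurrence concrete is to flag each equity trade that has at least one ETF trade within a fixed time window around it. The function name `co_occurrence_flags`, the window half-width `delta` and the sample timestamps below are all hypothetical assumptions introduced for this example.

```python
import numpy as np


def co_occurrence_flags(times_a, times_b, delta):
    """Flag each trade in times_a that has a trade in times_b within +/- delta.

    Hypothetical helper for illustration. Both inputs are trade timestamps
    in seconds and are assumed to be sorted in ascending order.
    """
    times_a = np.asarray(times_a, dtype=float)
    times_b = np.asarray(times_b, dtype=float)
    # Index of the first trade in times_b at or after (t - delta).
    lo = np.searchsorted(times_b, times_a - delta, side="left")
    # Index just past the last trade in times_b at or before (t + delta).
    hi = np.searchsorted(times_b, times_a + delta, side="right")
    # A non-empty index range means at least one co-occurring trade.
    return hi > lo


# Made-up timestamps (seconds) for an equity and an ETF.
equity_times = np.array([0.10, 0.50, 1.20, 3.00])
etf_times = np.array([0.12, 1.00, 2.97])

flags = co_occurrence_flags(equity_times, etf_times, delta=0.05)
share_co_occurring = flags.mean()  # fraction of equity trades with a nearby ETF trade
```

In this toy example the first and last equity trades are flagged as co-occurring. The choice of `delta` is an assumption: a wider window captures more pairs but dilutes the notion of close temporal proximity.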

The practical applications of this research are of interest to regulators and to other market participants such as high-frequency trading firms (HFTs), market makers (MMs) and hedge funds. For instance, understanding price formation mechanisms is vital to regulators who seek to understand the flow of information and systemic risk within a market. Analysis of lead-lag relationships is important to market participants such as HFTs and MMs, who seek to provide liquidity optimally across a broad set of asset classes.

To maximise the impact of my research, I am collaborating with senior quantitative researchers from Man Group. Man Group is the world's largest publicly traded hedge fund, and this partnership will ensure that our research objectives and methodologies align with the practical needs of industry, drawing on the technical expertise and guidance of industry practitioners.

This research offers multiple avenues for further exploration. The methodology is general enough to be adapted to other asset classes such as options and futures. We aim to extend our framework to go beyond pairwise interactions in asset classes and move to a more general graph-based framework.

This research aligns with the EPSRC's objectives of fostering innovation in mathematical sciences and engineering. By leveraging innovative statistical methods and machine learning, we aim to advance understanding in market microstructure across asset classes. The collaboration with industry stakeholders like Man Group underscores its relevance to real-world challenges, fulfilling the EPSRC's mandate for impactful and practical research.

Planned Impact

The primary CDT impact will be training 75 PhD graduates as the next generation of leaders in statistics and statistical machine learning. These graduates will lead in industry, government, health care, and academic research. They will bridge the gap between academia and industry, resulting in significant knowledge transfer to both established and start-up companies. Because this cohort will also learn to mentor other researchers, the CDT will ultimately address a UK-wide skills gap. The students will also be crucial in keeping the UK at the forefront of methodological research in statistics and machine learning.
After graduating, students will act as multipliers, educating others in advanced methodology throughout their careers. There is a range of further impacts:
- The CDT has a large number of high calibre external partners in government, health care, industry and science. These partnerships will catalyse immediate knowledge transfer, bringing cutting edge methodology to a large number of areas. Knowledge transfer will also be achieved through internships/placements of our students with users of statistics and machine learning.
- Our Women in Mathematics and Statistics summer programme is aimed at students who could go on to apply for a PhD. This programme will inspire the next generation of statisticians and also provide excellent leadership training for the CDT students.
- The students will develop new methodology and theory in the domains of statistics and statistical machine learning. The research will be relevant, addressing the key questions behind real-world problems. It will be published in the best possible statistics journals and machine learning conferences and will be made available online. To maximise reproducibility and replicability, source code and replication files will be made available as open-source software or, when relevant to an industrial collaboration, held as a patent or software copyright.

Publications

Studentship Projects

| Project Reference | Relationship | Related To | Start | End | Student Name |
|---|---|---|---|---|---|
| EP/S023151/1 | | | 01/04/2019 | 30/09/2027 | |
| 2740734 | Studentship | EP/S023151/1 | 01/10/2022 | 30/09/2026 | Nicolas Petit |