aeon: a toolkit for machine learning with time series

Lead Research Organisation: University of Southampton
Department Name: Electronics and Computer Science

Abstract

In recent years, machine learning frameworks such as scikit-learn have become essential infrastructure for modern data science. They have become the principal tool for practitioners and central components in scientific, commercial and industrial applications. But despite the ubiquity of time series data, until recently no such framework existed for machine learning with time series. In 2019, sktime was conceived to fill this gap, and it has become an established toolkit and software component for time series analysis, used worldwide by academics and industry alike.
It is an easy-to-use, flexible and modular framework for a wide range of time series machine learning tasks. Techniques for learning from time series have been developed in a range of disciplines, including statistics, machine learning, signal processing, econometrics and finance. sktime aims to link these communities by providing a unified interface for related time series tasks such as forecasting, classification, clustering, regression, annotation, anomaly detection and segmentation. It provides scikit-learn compatible algorithms and gives easy access to implementations of state-of-the-art algorithms not accessible in other packages.
This project will allow sktime to sustain and grow its operations by providing dedicated maintenance resource, enhancing its functionality and increasing engagement with scientific and industrial stakeholders. We wish to broaden the functionality of sktime to include new areas of active machine learning research and extend our user base to reach new communities of researchers. Our aim is to link theory and practice by making it easier and faster to apply state-of-the-art time series algorithms to real-world problems of genuine scientific interest.
To demonstrate this potential we will collaborate with domain experts on two applications. The first relates to predicting the early onset of dementia using electroencephalography (EEG). EEG recordings are time series that capture electrical activity in the brain using a series of electrodes placed on the scalp. The equipment is relatively cheap and portable. If we could use it to screen for early onset dementia, it could make a huge difference to the outcomes for many patients. However, the accuracy needed for clinical use is very hard to achieve. We will collaborate with experts in Cambridge who have clinical data to see whether state-of-the-art predictive models can outperform traditional approaches.
The second application involves analysing data generated from intensive care monitoring of children at Great Ormond Street Hospital (GOSH). Intensive care patients are continually monitored for vital body functions (heart rate, blood pressure, breathing rate, etc.). Increasingly, this time series data is captured and can be mined to improve clinical practice. We will collaborate with a research team already working with GOSH to explore whether sktime can be used to decrease the time it takes to analyse this data.
This research may lead to insights that improve clinical practice by answering questions such as "when is the best time to remove the tube that is helping a patient breathe?". It will also help us reach our broader goal of speeding up the discovery and dissemination of best practice. Data sharing between hospitals is, quite sensibly, difficult and time consuming. We wish to develop a new user base of hospital data scientists willing to share their research findings and code rather than their data. So, for example, if we discover something interesting in the GOSH data, we would like to rapidly share this finding and the code that verifies it on our data. This code sharing via sktime will dramatically reduce the time taken to test hypotheses on different observational data sets and give greater confidence in findings that are verified on independent groups of patients, transparently, by different researchers.

Related Projects

Project Reference  Relationship  Related To    Start       End         Award Value
EP/W030756/1                                   30/09/2022  30/07/2023  £534,661
EP/W030756/2       Transfer      EP/W030756/1  31/07/2023  30/05/2026  £403,617
 
Description aeon is a scikit-learn-compatible, open-source Python toolkit for time series machine learning (TSML). It has three core goals:

1. To provide an open framework that promotes reproducible research in TSML.
2. To offer easy access to cutting-edge algorithmic advancements for scientists and practitioners working with time series data.
3. To support newcomers in understanding key challenges and concepts in the field.

Since our relaunch in 2022, aeon has grown organically and now has a diverse team of 15 core developers of 10 different nationalities, with contributions from over 100 individuals. Researchers from countries including the UK, Ireland, Spain, France, Germany, Italy, Brazil, and Australia have shared open-source implementations of TSML algorithms featured in high-impact publications. We collaborate with academic and industry researchers in fields such as medicine, engineering, cybersecurity, materials science, biology, and environmental science to evaluate the impact of state-of-the-art aeon algorithms on real-world problems. Additionally, we actively participate in mentoring programs like Google Summer of Code and are developing educational resources based on aeon.

Our mission is to continue expanding aeon's functionality, usability, and user base to advance research, innovation, and accessibility in time series machine learning.
Exploitation Route Time series arise in all fields, and learning problems involving time series are common. These include time series versions of general machine learning tasks such as classification, regression and clustering, as well as tasks specific to time series such as forecasting, segmentation and anomaly detection. aeon provides an easy-to-use, open solution to these problems, with access to the latest research findings.
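To make the classification task concrete, the following is a minimal sketch of time series classification behind a scikit-learn-style fit/predict interface. It is an illustration only and does not use or reproduce aeon's API: the class name OneNearestNeighbourDTW, the function dtw_distance and the toy data are all hypothetical, and the technique shown (one-nearest-neighbour with dynamic time warping) is simply a common baseline chosen for brevity.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two univariate series.

    Uses squared pointwise differences as the local cost and no window
    constraint (an assumption made to keep the sketch short).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


class OneNearestNeighbourDTW:
    """Hypothetical toy classifier with a scikit-learn-style interface."""

    def fit(self, X, y):
        # Nearest-neighbour "training" just stores the labelled series.
        self._X = [list(series) for series in X]
        self._y = list(y)
        return self

    def predict(self, X):
        predictions = []
        for series in X:
            distances = [dtw_distance(series, ref) for ref in self._X]
            nearest = distances.index(min(distances))
            predictions.append(self._y[nearest])
        return predictions


# Toy data (made up for illustration): a "bump" shape and a "dip" shape,
# each appearing at slightly different positions in time.
X_train = [
    [0, 0, 1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0, 0, 0],
    [2, 2, 1, 0, 1, 2, 2],
    [2, 1, 0, 1, 2, 2, 2],
]
y_train = ["bump", "bump", "dip", "dip"]

clf = OneNearestNeighbourDTW().fit(X_train, y_train)
predictions = clf.predict([[0, 0, 0, 1, 2, 1, 0], [2, 2, 2, 1, 0, 1, 2]])
print(predictions)
```

Because dynamic time warping allows a series to be stretched along the time axis, the shifted test series still match their class exactly, and the script prints ['bump', 'dip']. A toolkit with a shared interface of this kind lets a distance-based baseline be swapped for a different algorithm without changing the surrounding code.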
Sectors Aerospace, Defence and Marine, Digital/Communication/Information Technologies (including Software), Electronics, Energy, Environment, Healthcare

URL http://aeon-toolkit.org
 
Description Our research is made easier and more robust because of aeon. Our findings in classification, regression and clustering have helped move these fields forward and set new state-of-the-art benchmarks. aeon is used by researchers around the world, and we have helped researchers from other fields use aeon to improve their results. This includes work in EEG classification, human activity recognition, materials science and environmental science.
First Year Of Impact 2023
Sector Digital/Communication/Information Technologies (including Software), Environment, Healthcare