A Unified Framework for Multiscale Machine Learning at the Edge

Lead Research Organisation: University of Bath
Department Name: Mathematical Sciences

Abstract

Advances in data storage capabilities and an increasingly technology-driven society have resulted in the collection of vast quantities of high-quality, varied and high-dimensional data. This 'big-data revolution' has in turn spurred a recent explosion in research in machine learning (ML), both applied and theoretical -- it is difficult to imagine a business sector or scientific field in which machine learning hasn't pushed forward high-throughput data analysis.

Due to the increasing complexity of algorithms, machine learning is often performed in 'the cloud'. A drawback of this is that data needs to be transferred to a central location and processed before individual devices are updated, which consumes a lot of energy and time. In addition, users are increasingly aware of potential security issues surrounding moving data between virtual locations. There is thus a need for machine learning tasks to be performed 'at the edge', for example on activity trackers, mobile phones or other smart devices. Such devices typically have access to a comparatively small amount of memory and data processing capability. Researchers thus need to develop new, application-tailored machine learning algorithms for these resource-constrained environments. In these settings, algorithm efficiency and the construction of low-dimensional features for learning are of the utmost importance.

A separate issue is that many machine learning algorithms have difficulty representing data with complicated features or structure, such as shapes and directional information. Such structure is often encountered in, for example, acoustic data or biomedical images. In addition, some methods do not handle missing data well; this can hamper accurate decision-making with machine learning.

For several years the PI has been at the forefront of developments in using wavelet and other multiscale signal processing methods, creating new techniques which relax traditional assumptions, and using them in new and innovative ways. Our proposal aims to address the drawbacks outlined above by developing new machine learning algorithms using so-called wavelet lifting techniques. Since such methods operate on data at different "scales", they are well-placed to represent time-varying structure or directional spatial shapes at different resolutions and across dimensions. They can also naturally handle missing data by adapting to available data sampling structures, and are memory-efficient since they use data replacement operations. Our approach will integrate these algorithms with machine learning methodology to widen the ability of learning algorithms to be used on low-memory devices. We aim to achieve improved robustness to missing data and test our developed methodology in a wide range of machine learning tasks, for example facial recognition, acoustic signal processing and pattern detection.
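To make the "data replacement" idea concrete, here is a minimal sketch of one level of the classical Haar transform written as lifting steps (predict, then update). It is only an illustration, not the project's algorithm: it assumes a regularly sampled signal of even length, and the function names are hypothetical. Each step overwrites existing entries, so no second array is needed.

```python
def haar_lift_forward(x):
    """One level of the Haar lifting transform, performed in place.

    Assumes an even-length, regularly sampled signal (illustrative only).
    Afterwards, even positions hold coarse-scale pair averages and odd
    positions hold fine-scale detail coefficients.
    """
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even-length signal")
    for i in range(0, len(x), 2):
        x[i + 1] -= x[i]      # predict: detail = odd sample - even sample
        x[i] += x[i + 1] / 2  # update: even sample becomes the pair average
    return x


def haar_lift_inverse(x):
    """Undo haar_lift_forward by reversing the steps with opposite signs."""
    for i in range(0, len(x), 2):
        x[i] -= x[i + 1] / 2  # undo update
        x[i + 1] += x[i]      # undo predict
    return x


signal = [2.0, 4.0, 6.0, 6.0, 1.0, 3.0, 5.0, 9.0]
coeffs = haar_lift_forward(list(signal))  # copy so `signal` is kept for comparison
approx = coeffs[0::2]   # coarse-scale averages
detail = coeffs[1::2]   # fine-scale differences
restored = haar_lift_inverse(list(coeffs))
```

Here `approx` is `[3.0, 6.0, 2.0, 7.0]`, `detail` is `[2.0, 0.0, 2.0, 4.0]`, and `restored` equals the original signal. Adapting to irregular or missing samples, as the abstract describes, would require predict and update weights that depend on the sampling locations; that is not shown in this sketch.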

Publications

Dupont E (2023) Spatial Confounding and Spatial+ for Nonlinear Covariate Effects in Journal of Agricultural, Biological and Environmental Statistics

McGonigle E (2022) Trend locally stationary wavelet processes in Journal of Time Series Analysis

 
Description This award has enabled the development of new, more realistic statistical models for data recorded over time and space. For complex time-dependent data, these models have been shown to provide a more accurate interpretation of important features. We have also advanced the understanding of when model identification is an issue, and of how to test for contradicting effects of external drivers included in the model, especially in the spatial data setting. The award has also inspired a new direction of research at the interface between machine learning and ecology.
Exploitation Route There is potential for the outcomes of this award to be taken up by a range of applied scientists. However, it is most likely that methodological statisticians and investigators of patterns in ecological data will be able to gain scientific insight by using the developed methodology.
Sectors Environment

 
Description Talk at international interdisciplinary workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk by Emiko Dupont introducing the practitioner community to new statistical methods, with a particular focus on deriving insight from complex datasets
Year(s) Of Engagement Activity 2023