A Unified Framework for Multiscale Machine Learning at the Edge
Lead Research Organisation:
University of Bath
Department Name: Mathematical Sciences
Abstract
Advances in data storage capabilities and an increasingly technology-driven society have resulted in the collection of vast quantities of high-quality, varied and high-dimensional data. This 'big-data revolution' has in turn spurred a recent explosion in research in machine learning (ML), both applied and theoretical; it is difficult to imagine a business sector or scientific field in which machine learning hasn't pushed forward high-throughput data analysis.
Due to the increasing complexity of algorithms, machine learning is often performed in 'the cloud'. A drawback of this is that data needs to be transferred to a central location and processed before individual devices are updated, which consumes considerable energy and time. In addition, users are increasingly aware of potential security issues surrounding moving data between virtual locations. There is thus a need for machine learning tasks to be performed 'at the edge', for example on activity trackers, mobile phones or other smart devices. Such devices typically have access to a comparatively small amount of memory and data processing capability. Researchers thus need to develop new, application-tailored machine learning algorithms for these resource-constrained environments. In these settings, algorithm efficiency and the construction of low-dimensional features for learning are of the utmost importance.
A separate issue is that many machine learning algorithms have difficulty representing data with complicated features or structure, such as shapes and directional information. Such structure is often encountered in, for example, acoustic data or biomedical images. In addition, some methods do not handle missing data well; this can hamper accurate decision-making with machine learning.
For several years the PI has been at the forefront of developments in using wavelet and other multiscale signal processing methods, creating new techniques which relax traditional assumptions, and using them in new and innovative ways. Our proposal aims to address the drawbacks outlined above by developing new machine learning algorithms using so-called wavelet lifting techniques. Since such methods operate on data at different "scales", they are well-placed to represent time-varying structure or directional spatial shapes at different resolutions and across dimensions. They can also naturally handle missing data by adapting to available data sampling structures, and are memory-efficient since they use data replacement operations. Our approach will integrate these algorithms with machine learning methodology to make learning algorithms more widely usable on low-memory devices. We aim to achieve improved robustness to missing data and test our developed methodology in a wide range of machine learning tasks, for example facial recognition, acoustic signal processing and pattern detection.
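As a rough illustration of the "data replacement operations" mentioned above, the sketch below applies one level of the textbook Haar lifting scheme to a regularly sampled signal, overwriting the input instead of allocating new arrays. It is a minimal example under stated assumptions, not the project's methodology: the function names `haar_lift_inplace` and `haar_unlift_inplace` are hypothetical, and the adaptive lifting schemes for irregular or missing data referred to in the abstract are more involved than this regular-grid case.

```python
def haar_lift_inplace(x):
    """Apply one level of Haar lifting to a list of even length, in place.

    After the call, even positions hold pairwise means (coarse scale)
    and odd positions hold pairwise differences (detail coefficients).
    Illustrative helper only; the name is an assumption.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    for i in range(0, len(x), 2):
        x[i + 1] -= x[i]       # predict: detail = odd sample - even sample
        x[i] += x[i + 1] / 2   # update: even sample becomes the pair mean
    return x


def haar_unlift_inplace(x):
    """Invert haar_lift_inplace by undoing the steps in reverse order."""
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    for i in range(0, len(x), 2):
        x[i] -= x[i + 1] / 2   # undo update
        x[i + 1] += x[i]       # undo predict
    return x


signal = [2.0, 4.0, 6.0, 10.0]
haar_lift_inplace(signal)      # signal now holds means and details
haar_unlift_inplace(signal)    # signal is restored to its original values
```

Because each step overwrites a sample with a value computed from its neighbour, no working memory beyond the signal itself is needed, which is the property that makes lifting attractive for low-memory devices.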
Publications
Dupont E
(2023)
Spatial Confounding and Spatial+ for Nonlinear Covariate Effects
in Journal of Agricultural, Biological and Environmental Statistics
Dupont E
(2022)
Rejoinder to the discussions of "Spatial+: A novel approach to spatial confounding".
in Biometrics
McGonigle E
(2022)
Trend locally stationary wavelet processes
in Journal of Time Series Analysis
Description | This award has enabled the development of new, more realistic statistical models for data recorded over time and space. We have demonstrated that, for complex time-dependent data, these models provide more accurate interpretation of important features in the data. We have also contributed new understanding of when model identification is an issue, and of how to test for contradicting effects of external drivers included in the model, especially in the spatial data setting. The award has also inspired a new direction of research at the interface between machine learning and ecology. |
Exploitation Route | There is potential for the outcomes of this award to be taken up by a range of applied scientists. However, it is most likely that methodological statisticians and investigators of patterns in ecological data will be able to gain scientific insight by using the developed methodology. |
Sectors | Environment |
Description | Talk at international interdisciplinary workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Talk by Emiko Dupont to introduce the practitioner community to new statistical methods, with a particular focus on deriving insight from complex datasets |
Year(s) Of Engagement Activity | 2023 |