Scalable Online Machine Learning

Lead Research Organisation: University of Liverpool
Department Name: Computer Science

Abstract

Many current problems in machine learning involve ever-growing datasets, for which accurately and efficiently estimating a posterior distribution to make predictions is very difficult, because it is hard to identify which information in the data is informative for modelling the target distribution. As well as the data growing in volume, the models often increase in dimensionality (and therefore complexity), which common methods such as Markov Chain Monte Carlo (MCMC) sampling struggle to handle efficiently, as they become much slower as the complexity of the model increases.

Other sampling methods, though, have been shown to scale much better than traditional MCMC. For example, Sequential Monte Carlo (SMC) sampling and Hamiltonian Monte Carlo (HMC) sampling scale considerably better with dimensionality. These techniques are not nearly as well researched, however, and have their own problems, so improving them further by introducing Reversible Jump (RJ) MCMC and Mass Matrices (MM) will be a mainstay of my research.
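To make the SMC idea concrete, the following is a minimal sketch of a likelihood-tempered SMC sampler for a toy problem (the posterior over the mean of Gaussian data with unit noise variance). It is an illustration only, not the method developed in this project: the model, the linear tempering schedule, the random-walk Metropolis move and every name (`smc_sampler`, `PRIOR_SD`, `step_sd` and so on) are assumptions chosen for brevity.

```python
import math
import random

PRIOR_SD = 10.0  # assumed prior: mu ~ N(0, PRIOR_SD^2)


def log_prior(mu):
    return -0.5 * (mu / PRIOR_SD) ** 2


def log_lik(mu, data):
    # Assumed model: y_i ~ N(mu, 1), up to an additive constant.
    return -0.5 * sum((y - mu) ** 2 for y in data)


def normalise(log_w):
    # Convert log-weights to normalised weights in a numerically stable way.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def smc_sampler(data, n_particles=500, n_steps=20, step_sd=0.5, seed=0):
    """Tempered SMC: move particles from the prior (beta=0) to the posterior (beta=1)."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, PRIOR_SD) for _ in range(n_particles)]
    log_w = [0.0] * n_particles
    betas = [t / n_steps for t in range(n_steps + 1)]

    for b_prev, b in zip(betas, betas[1:]):
        # 1. Reweight: incremental weight is the likelihood raised to (b - b_prev).
        log_w = [lw + (b - b_prev) * log_lik(x, data)
                 for lw, x in zip(log_w, particles)]
        w = normalise(log_w)

        # 2. Resample (multinomial) when the effective sample size drops too low.
        ess = 1.0 / sum(wi * wi for wi in w)
        if ess < n_particles / 2:
            particles = rng.choices(particles, weights=w, k=n_particles)
            log_w = [0.0] * n_particles

        # 3. Move: one random-walk Metropolis step that leaves the
        #    current tempered target (prior * likelihood^b) invariant.
        def log_target(x):
            return log_prior(x) + b * log_lik(x, data)

        moved = []
        for x in particles:
            proposal = x + rng.gauss(0.0, step_sd)
            log_ratio = log_target(proposal) - log_target(x)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = proposal
            moved.append(x)
        particles = moved

    return particles, normalise(log_w)
```

Calling `particles, weights = smc_sampler(data)` returns a weighted particle set, and `sum(w * x for w, x in zip(weights, particles))` estimates the posterior mean. In this conjugate toy case the estimate can be checked against the closed-form answer; in realistic high-dimensional models no such check exists, which is where the choice of move kernel (for example an HMC-style proposal) starts to matter.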

My colleague Josh Murphy will be working on Concept Drift, and I will be working on Eternal Learning. Initially there may be some similarities in the research we undertake, but because the problem concepts differ, our work will start to diverge within the first year. Further down the line, however, there will likely be some collaboration, as the methods developed in each area may overlap depending on the specific problem being tackled. I will also be researching methods to compress data from a constant and endless source in such a way that little to no information is lost during inference with the aforementioned sampling methods. This is necessary because, for eternal learning, the first sample is just as important as the most recent one.
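As a toy illustration of compressing an endless stream without losing information for inference, consider a conjugate Gaussian model, where a running count and running sum are sufficient statistics: the posterior computed from them is identical to the one computed from the full data, and the first sample counts exactly as much as the latest. This is only a sketch under strong assumptions (known noise variance, Gaussian prior); the class and field names are hypothetical, and the compression methods to be researched in this project are not specified in the abstract.

```python
from dataclasses import dataclass


@dataclass
class StreamingGaussianMean:
    """Posterior over the mean of y_i ~ N(mu, noise_var), prior mu ~ N(prior_mean, prior_var)."""
    prior_mean: float = 0.0
    prior_var: float = 100.0
    noise_var: float = 1.0
    n: int = 0          # number of samples seen so far
    total: float = 0.0  # running sum of samples

    def update(self, y):
        # Constant memory: the raw sample can be discarded after this call.
        self.n += 1
        self.total += y

    def posterior(self):
        # Standard conjugate update expressed through the sufficient statistics.
        precision = 1.0 / self.prior_var + self.n / self.noise_var
        mean = (self.prior_mean / self.prior_var
                + self.total / self.noise_var) / precision
        return mean, 1.0 / precision
```

After feeding samples one at a time with `update`, `posterior()` returns the same mean and variance as a batch computation over all samples, regardless of arrival order. Outside conjugate models such exact summaries rarely exist, which is why lossless or near-lossless compression for general samplers is a research question.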

Finally, these new methods will be applied in a Bayesian deep learning context. In neural networks, using backpropagation to fit the weights and biases of a model is very computationally expensive. Research into particle filters as an alternative to backpropagation (or alongside it in an ensemble method) has started in the past couple of years, as they are less computationally expensive. I will expand upon this by implementing a quasi-differentiable SMC sampler (as opposed to a particle filter) to aid the optimisation process in neural networks.

Planned Impact

This CDT's focus on using "Future Computing Systems" to move "Towards a Data-driven Future" resonates strongly with two themes among non-academic organisations. In both themes, albeit for slightly different reasons, commodity data science is insufficient, and there is a hunger both for the future leaders that this CDT will produce and for the high-performance solutions that the students will develop.

The first theme is associated with defence and security. In this context, operational performance is of paramount importance. Government organisations (e.g., Dstl, GCHQ and the NCA) will benefit from our graduates' ability to configure many-core hardware to maximise the ability to extract value from the available data. The CDT's projects and graduates will achieve societal impact by enabling these government organisations to better protect the world's population from threats posed by, for example, international terrorism and organised crime.

There is then a supply chain of industrial organisations that deliver to government organisations (both in the UK and overseas). These industrial organisations (e.g., Cubica, Denbridge Marine, FeatureSpace, Leonardo, MBDA, Ordnance Survey, QinetiQ, RiskAware, Sintela, THALES (Aveillant) and Vision4ce) operate in a globally competitive marketplace where operational performance is a key driver. The skilled graduates that this CDT will provide (and the projects that will comprise the students' PhDs) are critical to these organisations' ability to develop and deliver high-performance products and services. We therefore anticipate economic impact to result from this CDT.

The second theme is associated with high-value and high-volume manufacturing. In these contexts, profit margins are very sensitive to operational costs. For example, a change to the configuration of a production line for an aerosol manufactured by Unilever might "only" cut costs by 1p for each aerosol, but multiplied across half a billion aerosols each year that is a saving of about £5 million, so the impact on profit can be significant. In this context, industry (e.g., Renishaw, Rolls Royce, Schlumberger, ShopDirect and Unilever) is therefore motivated to optimise operational costs by learning from historic data. This CDT's graduates (and their projects) will help these organisations to perform such data-driven optimisation and thereby enable the CDT to achieve further economic impact.

Other organisations (e.g., IBM) provide hardware, software and advice to those operating in these themes. The CDT's graduates will ensure these organisations can be globally competitive.

The specific organisations mentioned above are the CDT's current partners. These organisations have all agreed to co-fund studentships. That commitment indicates that, in the short term, they are likely to be the focus for the CDT's impact. However, other organisations are likely to benefit in the future. While two (Lockheed Martin and Arup) have articulated their support in letters that are attached to this proposal, we anticipate impact via a larger portfolio of organisations (e.g., via studentships but also via those organisations recruiting the CDT's graduates either immediately after the CDT or later in the students' careers). Those organisations are likely to include those inhabiting the two themes described above, but also others. For example, an entrepreneurial CDT student might identify a niche in another market sector where Distributed Algorithms can deliver substantial commercial or societal gains. Predicting where such niches might be is challenging, though it seems likely that sectors that have yet to fully embrace Data Science while also involving significant turnover are those that will have the most to gain: we hypothesise that niches might be identified in health and actuarial science, for example.

As well as training the CDT students to be the leaders of tomorrow in Distributed Algorithms, we will also achieve impact by training the CDT's industrial supervisors.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023445/1                                   01/04/2019  30/09/2027
2599529            Studentship   EP/S023445/1  01/10/2021  30/09/2025  Andrew Millard