Uncertain Heterogenous Algorithmic Teamwork

Lead Research Organisation: University of Liverpool
Department Name: Electrical Engineering and Electronics

Abstract

This project is focused on showing that raw aggregated capability can outperform carefully constructed co-ordination. More specifically, while dedicated high-performance computing resources provide a carefully constructed homogeneous environment that can make best use of the available hardware, there are settings where the availability of vast quantities of computational hardware should more than make up for the disparate connectivity and capability of the hardware. Recent collaboration between the University of Liverpool and IBM Research has developed numerical Bayesian techniques that exploit homogeneous super-computing hardware to outperform algorithms that are, by default, configured to make use of a single processing core. These techniques pave the way for a generic solution to the problem of performing algorithmic teamwork in the context of data science.

This PhD will investigate whether the pre-defined divide-and-conquer algorithm at the heart of the aforementioned numerical Bayesian techniques can be modified to adapt to, and operate effectively within, an uncertain computational environment. This will involve the development of fast distributed algorithms to identify and instantiate a divide-and-conquer architecture that is near-optimal given the available resources. The aim of the project is to develop the infrastructure that makes it possible for vast heterogeneous compute resources to operate effectively as a team. If successful, the intention is to use spare computer infrastructure available globally to deploy teamwork-based algorithms to answer a fundamental societal question, which is yet to be identified.

Planned Impact

This CDT's focus on using "Future Computing Systems" to move "Towards a Data-driven Future" resonates strongly with two themes of non-academic organisation. In both themes, albeit for slightly different reasons, commodity data science is insufficient and there is a hunger both for the future leaders that this CDT will produce and the high-performance solutions that the students will develop.

The first theme is associated with defence and security. In this context, operational performance is of paramount importance. Government organisations (e.g., Dstl, GCHQ and the NCA) will benefit from our graduates' ability to configure many-core hardware to maximise the ability to extract value from the available data. The CDT's projects and graduates will achieve societal impact by enabling these government organisations to better protect the world's population from threats posed by, for example, international terrorism and organised crime.

There is then a supply chain of industrial organisations that deliver to government organisations (both in the UK and overseas). These industrial organisations (e.g., Cubica, Denbridge Marine, FeatureSpace, Leonardo, MBDA, Ordnance Survey, QinetiQ, RiskAware, Sintela, THALES (Aveillant) and Vision4ce) operate in a globally competitive marketplace where operational performance is a key driver. The skilled graduates that this CDT will provide (and the projects that will comprise the students' PhDs) are critical to these organisations' ability to develop and deliver high-performance products and services. We therefore anticipate economic impact to result from this CDT.

The second theme is associated with high-value and high-volume manufacturing. In these contexts, profit margins are very sensitive to operational costs. For example, a change to the configuration of a production line for an aerosol manufactured by Unilever might "only" cut costs by 1p for each aerosol, but when multiplied by half a billion aerosols each year, the impact on profit can be significant. In this context, industry (e.g., Renishaw, Rolls Royce, Schlumberger, ShopDirect and Unilever) is therefore motivated to optimise operational costs by learning from historic data. This CDT's graduates (and their projects) will help these organisations to perform such data-driven optimisation and thereby enable the CDT to achieve further economic impact.

Other organisations (e.g., IBM) provide hardware, software and advice to those operating in these themes. The CDT's graduates will ensure these organisations can be globally competitive.

The specific organisations mentioned above are the CDT's current partners. These organisations have all agreed to co-fund studentships. That commitment indicates that, in the short term, they are likely to be the focus for the CDT's impact. However, other organisations are likely to benefit in the future. While two (Lockheed Martin and Arup) have articulated their support in letters that are attached to this proposal, we anticipate impact via a larger portfolio of organisations (e.g., via studentships but also via those organisations recruiting the CDT's graduates either immediately after the CDT or later in the students' careers). Those organisations are likely to include those inhabiting the two themes described above, but also others. For example, an entrepreneurial CDT student might identify a niche in another market sector where Distributed Algorithms can deliver substantial commercial or societal gains. Predicting where such niches might be is challenging, though it seems likely that sectors that are yet to fully embrace Data Science while also involving significant turn-over are those that will have the most to gain: we hypothesise that niches might be identified in health and actuarial science, for example.

As well as training the CDT students to be the leaders of tomorrow in Distributed Algorithms, we will also achieve impact by training the CDT's industrial supervisors.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023445/1       -             -             01/04/2019  30/09/2027  -
2270559            Studentship   EP/S023445/1  01/10/2019  30/09/2023  Matthew Carter
 
Description: My research is centred on the following three axes:
1. Utilising distributed and high-performance compute environments to accelerate the decision-making process
2. Utilising natural language processing to support academic and industrial ventures
3. Utilising machine learning and artificial intelligence to inform COVID-19 decision making

The main output from my PhD project (axis 1) is a framework that executes a machine learning algorithm, a Sequential Monte Carlo (SMC) sampler, on a team of desktop PCs, laptops, Android phones and GPUs (via a piece of software called HTCondor). SMC samplers can be applied in fields ranging from manufacturing, to defence and security, to astronomy. Moreover, they have the potential to be several times faster than current state-of-the-art methods and allow us to solve more complex and ambitious problems. I have worked with a company called Evergreen Life (axis 3), where I analysed customer healthcare data to gain insights into the COVID-19 pandemic. The natural language processing work (axis 2) is largely still in progress. However, I supervised a 6-month EPSRC Summer Vacation Internship related to this project.
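For readers unfamiliar with SMC samplers, the following is a minimal single-process sketch in Python. It is not the framework described above and does not involve HTCondor or any distribution across machines. Everything specific in it is an assumption chosen for illustration: the one-dimensional Gaussian target, the broad Gaussian starting distribution, the linear tempering schedule, the random-walk Metropolis move, multinomial resampling, and all function and parameter names.

```python
import math
import random


def log_initial(x):
    # Log-density (up to a constant) of a broad starting distribution N(0, 5^2).
    return -0.5 * (x / 5.0) ** 2


def log_target(x):
    # Log-density (up to a constant) of an illustrative target N(2, 0.5^2).
    return -0.5 * ((x - 2.0) / 0.5) ** 2


def log_tempered(x, beta):
    # Geometric bridge: beta = 0 gives the starting distribution, beta = 1 the target.
    return (1.0 - beta) * log_initial(x) + beta * log_target(x)


def normalise(log_w):
    # Convert log-weights to normalised weights, subtracting the max for stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]


def smc_sampler(n_particles=500, n_steps=20, step_size=0.5, seed=0):
    rng = random.Random(seed)
    # Initialise particles from the starting distribution with equal weights.
    particles = [rng.gauss(0.0, 5.0) for _ in range(n_particles)]
    log_w = [0.0] * n_particles
    for t in range(1, n_steps + 1):
        beta_prev, beta = (t - 1) / n_steps, t / n_steps
        # Reweight: ratio of successive tempered densities at each particle.
        log_w = [lw + log_tempered(x, beta) - log_tempered(x, beta_prev)
                 for lw, x in zip(log_w, particles)]
        weights = normalise(log_w)
        # Resample (multinomial) when the effective sample size falls below half.
        ess = 1.0 / sum(w * w for w in weights)
        if ess < n_particles / 2:
            particles = rng.choices(particles, weights=weights, k=n_particles)
            log_w = [0.0] * n_particles
        # Move: one random-walk Metropolis step per particle, targeting the
        # current tempered density so that the weights remain valid.
        moved = []
        for x in particles:
            proposal = x + rng.gauss(0.0, step_size)
            log_accept = log_tempered(proposal, beta) - log_tempered(x, beta)
            if math.log(1.0 - rng.random()) < log_accept:
                x = proposal
            moved.append(x)
        particles = moved
    return particles, normalise(log_w)


particles, weights = smc_sampler()
estimate = sum(w * x for w, x in zip(weights, particles))
print(f"Weighted estimate of the target mean: {estimate:.3f}")
```

In this sketch the reweight and move steps treat each particle independently, which is the part that could in principle be shared across workers, whereas resampling needs the weights of all particles. How the actual framework divides this work is not described in this record.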
Exploitation Route: Sampling methods (axis 1) are used in numerous industries: for example, the signal processing group that I sit in has partners in manufacturing, defence and security, pharmaceuticals and astronomy. A summer vacation internship was proposed off the back of axis 2, and a further internship based on this work is being advertised for summer 2022. The output of axis 2 can also be applied to a number of industries: the software developed maps the landscape of different markets by downloading papers from Scopus and analysing statistics such as funding, citations and economic/societal impact.
Sectors: Aerospace, Defence and Marine; Digital/Communication/Information Technologies (including Software); Financial Services, and Management Consultancy; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology; Security and Diplomacy