EPSRC Centre for Doctoral Training in Distributed Algorithms: the what, how and where of next-generation data science

Lead Research Organisation: University of Liverpool
Department Name: Electrical Engineering and Electronics

Abstract

This CDT will train a cohort of 60 students to have the skills and experience that enable them to become leaders in Distributed Algorithms: capitalising on "Future Computing Systems" to move "Towards a Data-Driven Future".

Commodity Data Science is already pervasive. This motivates today's pressing need for highly trained data scientists. This CDT will empower tomorrow's leaders of data science. The UK (and the world) needs data scientists who can best exploit tomorrow's computational resources to harvest the new 'oil': the information present in data.

As our graduates' careers progress, many-core architectures will become increasingly commonplace. We anticipate millions more cores in tomorrow's desktops than today's. This core count will challenge the assumption made by current Big Data middleware (e.g., Spark and TensorFlow) that the details of future computing systems can be decoupled from the development of data science tools and techniques. More specifically, it will become imperative that data scientists understand how to design algorithms that can operate effectively in environments where data movement is the key performance bottleneck.

To meet this need, we will provide training that produces highly employable individuals who understand both the design of future computer hardware and how and when to adapt algorithmic solutions to best exploit the computational resources that will exist in the future.

From the outset, the students will be embedded in a computing environment that anticipates the hardware resources that will arrive on their desks after they graduate, not the hardware that exists today. The cohort of students provides the critical mass that motivates engagement with internationally leading supercomputing centres: STFC's Hartree Centre is an integral part of the team, and links we have established with IBM Research in the US will provide students with access to state-of-the-art computing hardware. This anticipation of future computing capability will not only ensure our graduates are highly employable but also help motivate end-user organisations to engage with the CDT.

We have identified end-user organisations that span two themes: defence and security, and manufacturing. Organisations in these themes are driven by performance demands and efficiency requirements respectively.

We will align the training we provide with the needs of the cohort, the theme and the individual. Each studentship will have two academic supervisors (one aligned with "Future Computing Systems" and one aligned with moving "Towards a Data-Driven Future") and at least one supervisor from a project partner. This supervisory team will co-define the scope of each studentship. Once a high-quality student has been selected and recruited, we will work with the student to define the training that aligns with their needs and the specific demands of the studentship. Our training provision will cover the training needs associated with both the "Future Computing Systems" and "Towards a Data-Driven Future" priority areas. We will use guest lectures from, for example, IBM (as used to train Fast Track civil servants) and UC Berkeley to maximise our graduates' ability to thrive and to become tomorrow's leaders in Distributed Algorithms.

Planned Impact

This CDT's focus on using "Future Computing Systems" to move "Towards a Data-Driven Future" resonates strongly with two themes of non-academic organisation. In both themes, albeit for slightly different reasons, commodity data science is insufficient and there is a hunger both for the future leaders that this CDT will produce and for the high-performance solutions that the students will develop.

The first theme is associated with defence and security. In this context, operational performance is of paramount importance. Government organisations (e.g., Dstl, GCHQ and the NCA) will benefit from our graduates' ability to configure many-core hardware to maximise the ability to extract value from the available data. The CDT's projects and graduates will achieve societal impact by enabling these government organisations to better protect the world's population from threats posed by, for example, international terrorism and organised crime.

There is then a supply chain of industrial organisations that deliver to government organisations (both in the UK and overseas). These industrial organisations (e.g., Cubica, Denbridge Marine, FeatureSpace, Leonardo, MBDA, Ordnance Survey, QinetiQ, RiskAware, Sintela, THALES (Aveillant) and Vision4ce) operate in a globally competitive marketplace where operational performance is a key driver. The skilled graduates that this CDT will provide (and the projects that will comprise the students' PhDs) are critical to these organisations' ability to develop and deliver high-performance products and services. We therefore anticipate economic impact to result from this CDT.

The second theme is associated with high-value and high-volume manufacturing. In these contexts, profit margins are very sensitive to operational costs. For example, a change to the configuration of a production line for an aerosol manufactured by Unilever might "only" cut costs by 1p for each aerosol, but when multiplied by half a billion aerosols each year, the saving is about £5 million a year, so the impact on profit can be significant. Industry (e.g., Renishaw, Rolls Royce, Schlumberger, ShopDirect and Unilever) is therefore motivated to optimise operational costs by learning from historic data. This CDT's graduates (and their projects) will help these organisations to perform such data-driven optimisation and thereby enable the CDT to achieve further economic impact.

Other organisations (e.g., IBM) provide hardware, software and advice to those operating in these themes. The CDT's graduates will ensure these organisations can be globally competitive.

The specific organisations mentioned above are the CDT's current partners. These organisations have all agreed to co-fund studentships. That commitment indicates that, in the short term, they are likely to be the focus for the CDT's impact. However, other organisations are likely to benefit in the future. While two (Lockheed Martin and Arup) have articulated their support in letters that are attached to this proposal, we anticipate impact via a larger portfolio of organisations (e.g., via studentships but also via those organisations recruiting the CDT's graduates either immediately after the CDT or later in the students' careers). Those organisations are likely to include those inhabiting the two themes described above, but also others. For example, an entrepreneurial CDT student might identify a niche in another market sector where Distributed Algorithms can deliver substantial commercial or societal gains. Predicting where such niches might be is challenging, though it seems likely that sectors that have yet to fully embrace Data Science while also involving significant turnover are those that will have the most to gain: we hypothesise that niches might be identified in health and actuarial science, for example.

As well as training the CDT students to be the leaders of tomorrow in Distributed Algorithms, we will also achieve impact by training the CDT's industrial supervisors.

Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S023445/1 | | | 01/04/2019 | 30/09/2027 |
2299114 | Studentship | EP/S023445/1 | 01/09/2019 | 31/12/2023 | Marco Fontana
2297756 | Studentship | EP/S023445/1 | 01/10/2019 | 31/03/2022 | Konstantinos Alexandridis
2298149 | Studentship | EP/S023445/1 | 01/10/2019 | 30/09/2023 | Theofilos Triommatis
2298140 | Studentship | EP/S023445/1 | 01/10/2019 | 05/01/2022 | Carlos Tiago de Melo Mota Ferreira Arinto
2338193 | Studentship | EP/S023445/1 | 01/10/2019 | 30/09/2023 | Julia Kolaszynska
2270559 | Studentship | EP/S023445/1 | 01/10/2019 | 30/09/2023 | Matthew Carter
2297823 | Studentship | EP/S023445/1 | 01/10/2019 | 30/09/2023 | Vincent Beraud
2298146 | Studentship | EP/S023445/1 | 01/10/2019 | 30/09/2023 | Emmanouil Pitsikalis
2445280 | Studentship | EP/S023445/1 | 01/10/2020 | 30/09/2024 | Pangiotis Pentaliotis
2445278 | Studentship | EP/S023445/1 | 01/10/2020 | 30/09/2024 | Jack Wells
2445289 | Studentship | EP/S023445/1 | 01/10/2020 | 30/09/2024 | Efthyvoulos Drousiotis
2447388 | Studentship | EP/S023445/1 | 01/10/2020 | 20/12/2024 | Elinor Davies
2447387 | Studentship | EP/S023445/1 | 01/10/2020 | 30/09/2024 | Benedict Oakes
2447391 | Studentship | EP/S023445/1 | 01/10/2020 | 30/09/2024 | Mehdi Anhichem
2447389 | Studentship | EP/S023445/1 | 05/10/2020 | 04/10/2024 | Oisin Boyle
2467925 | Studentship | EP/S023445/1 | 01/11/2020 | 31/10/2024 | Adam Lee
2476782 | Studentship | EP/S023445/1 | 01/12/2020 | 30/11/2024 | Jordan Robinson
2599528 | Studentship | EP/S023445/1 | 01/10/2021 | 30/09/2025 | Kieron McCallan
2599527 | Studentship | EP/S023445/1 | 01/10/2021 | 30/09/2025 | George Jones
2599524 | Studentship | EP/S023445/1 | 01/10/2021 | 30/09/2025 | Benjamin Rise
2599529 | Studentship | EP/S023445/1 | 01/10/2021 | 30/09/2025 | Andrew Millard
2599531 | Studentship | EP/S023445/1 | 01/10/2021 | 01/07/2023 | Jack Taylor
2599530 | Studentship | EP/S023445/1 | 01/10/2021 | 30/09/2025 | Joshua Murphy
2599525 | Studentship | EP/S023445/1 | 01/10/2021 | 30/09/2025 | Alexander Bird
2599526 | Studentship | EP/S023445/1 | 01/10/2021 | 20/12/2025 | Christian Blackman
2636081 | Studentship | EP/S023445/1 | 01/11/2021 | 31/10/2025 | William Pearson
2644638 | Studentship | EP/S023445/1 | 01/11/2021 | 31/10/2025 | Jinhao Gu
2636034 | Studentship | EP/S023445/1 | 01/11/2021 | 31/10/2025 | William Jeffcott
2640133 | Studentship | EP/S023445/1 | 01/12/2021 | 30/11/2025 | Oliver Dippel
2640147 | Studentship | EP/S023445/1 | 01/12/2021 | 30/11/2025 | Jianyang Xie
2748709 | Studentship | EP/S023445/1 | 01/10/2022 | 31/01/2027 | John Bentas
2748733 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Bettina Hanlon
2748703 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Sarah Askevold
2748722 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Georgios Chionas
2748812 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Adam Neal
2748834 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Joshua Wakefield
2748743 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Harvinder Lehal
2748823 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Dominika Soltysik
2748750 | Studentship | EP/S023445/1 | 01/10/2022 | 30/09/2026 | Carole Liao
2771570 | Studentship | EP/S023445/1 | 01/11/2022 | 31/10/2026 | William Shaw
2799421 | Studentship | EP/S023445/1 | 01/12/2022 | 30/11/2026 | Tymofii Prokopenko
2889696 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Teodor-Avram Ciochirca
2889818 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Daniel Sumler
2889845 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Ruojun Zhang
2889679 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Finlay Boulton
2889824 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Adam Williams
2889729 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Omree Naim
2889834 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Alexander Williams
2889812 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Christian Pollitt
2889687 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Daniel Chadwick
2889699 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Finn Henman
2889839 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Wanrong Yang
2889721 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Richard Jinschek
2889801 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Daniel Nash
2889702 | Studentship | EP/S023445/1 | 01/10/2023 | 30/09/2027 | Wenping Jiang