Performance modelling and dynamic scheduling of geometric multigrid problems on heterogeneous supercomputing clusters

Lead Research Organisation: University of Warwick
Department Name: Computer Science

Abstract

The supercomputer is an essential tool for solving computational problems that are far too large for a single computer and that tend to be of great significance. A supercomputer is a large collection of individual computers networked together; large problems are processed by dividing them into pieces and giving one piece to each computer to process in parallel. A well-known example is weather forecasting as performed by the Met Office, which runs a detailed model of the climate on its supercomputer in order to predict the weather over the following days. However, supercomputers are very expensive to obtain, upgrade and operate: purchasing or upgrading a modern high-end supercomputer with tens of thousands of microprocessors costs tens of millions of pounds, and its electricity bill runs to millions of pounds. For a supercomputer owner to make informed financial decisions, it is crucial to understand the relationship between their particular computational needs and the configuration of their supercomputer, both as it currently stands and as it might be configured in future.

This research is a partnership with the aerospace division of Rolls-Royce, which designs and manufactures jet engines, and with Intel, a semiconductor manufacturer primarily known for its computer microprocessors. The research has two goals. The first is to develop models of the relationship between the processing time of simulations in HYDRA, Rolls-Royce's computational fluid dynamics engine for supercomputers, and varying supercomputer configurations, including the one currently owned and operated by Rolls-Royce, and also to develop tools that assist with the development of those models. This entails modelling comprehensively how HYDRA divides simulations into pieces for parallel processing and then, with the assistance of Intel, modelling how different hardware configurations fare in processing those pieces. These models will provide two benefits: identifying inefficiencies in HYDRA, and allowing Rolls-Royce to evaluate potential upgrades to their supercomputer with regard to their effect on the performance of HYDRA.

The second goal of this research is to develop a scheduling algorithm that uses the performance models of HYDRA simulations to allocate proportions of a supercomputer to simulation jobs as they are submitted, so as to maximise the ratio of overall simulation work performed to electricity consumed. Such a scheduling algorithm will not be limited to HYDRA: it will be compatible with any simulation engine that has a performance model, so it is of potential benefit to any business or research group that operates a supercomputer.
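The scheduling objective described above can be illustrated with a minimal sketch. It is not the project's algorithm: the runtime formula, the `Job` fields, the power figures and the greedy strategy are all assumptions made for illustration. Each job carries a hypothetical performance model that predicts runtime for a given node count, and nodes are handed out one at a time to whichever job most improves the ratio of work rate to power draw (work per joule), stopping when no further node helps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A simulation job with a toy performance model (all fields hypothetical)."""
    name: str
    work: float        # abstract work units, e.g. solver iterations
    serial_s: float    # seconds that do not parallelise
    parallel_s: float  # single-node seconds that divide across nodes
    comm_s: float      # assumed communication overhead per node, in seconds


def predicted_runtime(job: Job, nodes: int) -> float:
    """Toy performance model: serial part + divided part + communication cost."""
    return job.serial_s + job.parallel_s / nodes + job.comm_s * nodes


def throughput(job: Job, nodes: int) -> float:
    """Predicted work units per second; zero if the job holds no nodes."""
    return 0.0 if nodes == 0 else job.work / predicted_runtime(job, nodes)


def efficiency(jobs, alloc, total_nodes, idle_w, busy_w) -> float:
    """Work per joule: summed work rate divided by whole-machine power draw."""
    rate = sum(throughput(j, alloc[j.name]) for j in jobs)
    used = sum(alloc.values())
    power = idle_w * (total_nodes - used) + busy_w * used
    return rate / power


def allocate(jobs, total_nodes, idle_w=100.0, busy_w=300.0):
    """Greedy heuristic: add one node at a time while efficiency improves."""
    alloc = {j.name: 0 for j in jobs}
    best = efficiency(jobs, alloc, total_nodes, idle_w, busy_w)
    while sum(alloc.values()) < total_nodes:
        candidates = []
        for j in jobs:
            trial = dict(alloc)
            trial[j.name] += 1
            score = efficiency(jobs, trial, total_nodes, idle_w, busy_w)
            candidates.append((score, j.name))
        score, name = max(candidates)
        if score <= best:
            break  # no single extra node improves work per joule
        alloc[name] += 1
        best = score
    return alloc
```

Because the heuristic is greedy, it finds a local optimum only, and it may leave a poorly scaling job with no nodes at all. A practical scheduler would also need fairness or deadline constraints, which this sketch omits.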

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/N509401/1 | | | 01/10/2015 | 25/02/2022 |
1643692 | Studentship | EP/N509401/1 | 05/10/2015 | 07/03/2020 | Andrew Martin Owenson
 
Description: 'HYDRA' is a software application used by Rolls-Royce to simulate airflow through turbofans as part of their design process. These simulations require supercomputers to provide the necessary high accuracy and resolution. Rolls-Royce are therefore interested in improving the performance of this code, both by upgrading computing hardware and by adding software optimisations to HYDRA. However, HYDRA is Rolls-Royce intellectual property, which complicates the evaluation of external systems, and it is a large code, which slows the addition of optimisations.

To address these shortcomings, I have designed a "mini-application" of HYDRA, which is much smaller than HYDRA and contains no intellectual property. On its own, this mini-application can indicate whether particular hardware or optimisations will benefit HYDRA, but its results are not necessarily accurate. To improve its utility further, I have developed a targeted analytical model of the 'performance difference' between the mini-application and HYDRA, which enables the mini-application to provide accurate (less than 10% error) predictions of HYDRA performance.
Exploitation Route: I expect this mini-application and its associated model to be used by the recently formed ASiMoV partnership between Rolls-Royce and several UK universities. The partnership's goal is "to achieve the world's first high fidelity simulation of a complete gas-turbine engine during operation". This requires the evaluation of new hardware and optimisations, a need that my recent work is well positioned to meet.

URL for more information on ASiMoV: https://gow.epsrc.ukri.org/NGBOViewGrant.aspx?GrantRef=EP/S005072/1
Sectors: Aerospace, Defence and Marine