Malleability in resource allocation for improved system efficiency in high-performance computing

Lead Research Organisation: University of Edinburgh

Department Name: Edinburgh Parallel Computing Centre

Abstract

A significant part of the environmental impact and CO2 emissions of a high-performance computing (HPC) system can be attributed to its manufacturing as well as its operation (including running idle). Once a system has been installed, it is therefore imperative that it is used as close to full capacity as possible and that science throughput is maximised at all times, in order to get the best return on both the monetary and the carbon cost of the system. This highly desirable 100% utilisation rate is, however, near impossible to achieve in practice. The workload of a system is managed by its resource allocator, which attempts to place jobs from a submission queue (to which users continuously add new jobs) so that they fill gaps in the available resources. Perfect job placement is not always possible and, as a result, resources sit idle.

Malleability in resource allocation introduces the concept that the resources (the number of compute cores or nodes, or even the system) that have been requested by a user at job submission time are not fixed and can be changed if this change means a job can be scheduled to run, and thus complete, sooner.
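The effect described above can be illustrated with a toy calculation. The following is a minimal sketch only, not the project's method: the names `Running`, `earliest_start` and `best_allocation` are hypothetical, the job is assumed to scale perfectly (runtime equals node-hours of work divided by node count), and only shrinking below the requested node count is considered.

```python
from dataclasses import dataclass


@dataclass
class Running:
    """A job currently occupying nodes (hypothetical toy model)."""
    nodes: int
    end: float  # hours from now until the job releases its nodes


def earliest_start(total_nodes, running, need):
    """Return the earliest time at which `need` nodes are free."""
    free = total_nodes - sum(r.nodes for r in running)
    if free >= need:
        return 0.0
    # Release nodes in order of job completion until the request fits.
    for r in sorted(running, key=lambda r: r.end):
        free += r.nodes
        if free >= need:
            return r.end
    raise ValueError("request exceeds system size")


def best_allocation(total_nodes, running, requested, work, min_nodes=1):
    """Pick the node count (<= requested) with the earliest finish time.

    `work` is in node-hours. ASSUMPTION: perfect scaling, so the
    runtime on n nodes is work / n. Real applications scale less well.
    Returns a tuple (nodes, start, finish).
    """
    best = None
    for n in range(min_nodes, requested + 1):
        start = earliest_start(total_nodes, running, n)
        finish = start + work / n
        if best is None or finish < best[2]:
            best = (n, start, finish)
    return best


if __name__ == "__main__":
    # 16-node system, 12 nodes busy for the next 4 hours.
    running = [Running(nodes=12, end=4.0)]
    # The user asked for 8 nodes for a job of 16 node-hours.
    rigid_start = earliest_start(16, running, 8)
    print("rigid:", (8, rigid_start, rigid_start + 16.0 / 8))
    print("malleable:", best_allocation(16, running, requested=8, work=16.0))
```

In this example the rigid 8-node request has to wait 4 hours and finishes after 6 hours, whereas the same job shrunk to the 4 idle nodes starts immediately and finishes after 4 hours, using nodes that would otherwise have sat idle.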

MIRA ("Malleability In Resource Allocation for improved system efficiency in high-performance computing") will investigate the concept of malleability in compute resource allocation within a single system as well as across multiple systems, to improve overall system utilisation and science throughput, thereby maximising the "science per Joule" that can be achieved.

Funded Value: £163,412

Funded Period: Jan 24 - Jun 25

Funder: EPSRC

Project Status: Active

Project Category: Research Grant

Project Reference: EP/Y53061X/1

Principal Investigator: Mark Parsons

Research Subject: Energy (50%); Info. & commun. Technol. (50%)

Research Topic: Energy Efficiency (50%); Software Engineering (50%)

Organisations

University of Edinburgh (Lead Research Organisation)

People	ORCID iD
Mark Parsons (Principal Investigator)	http://orcid.org/0000-0003-4097-7468
Michele Weiland (Co-Investigator)	http://orcid.org/0000-0003-4713-3073
