GW4 Tier-2 HPC Centre for Advanced Architectures

Lead Research Organisation: University of Bristol
Department Name: Computer Science

Abstract

This proposal, led by Bristol, is from the Consortium behind the successful Isambard Tier-2 HPC service [1]: the GW4 Alliance of Bath, Bristol, Cardiff and Exeter, in partnership with Cray and the Met Office. Isambard delivered the world's first production Arm-based HPC service, creating significant impact for the worldwide HPC community in the process. The Isambard 2 project aims to build on this success by adding the next generation of Exascale-enabling, Arm-based technologies (Fujitsu's A64FX), while significantly expanding and upgrading the current service to meet growing demand. The A64FX CPU is generating excitement because it delivers memory bandwidth and floating-point capability closer to those of a GPU than a CPU (approximately 1 TeraByte/s and 3 TeraFLOP/s double precision, respectively). Providing access to these transformational technologies as a national service, in parallel with an expanded Arm-based production system, would be extremely advantageous to the UK's HPC community, supporting the porting and optimising of codes for future Exascale systems (the Fugaku system at RIKEN in Japan is just one example known to be based on A64FX). The Isambard 2 system will also provide one of the world's most comprehensive Multi Architecture Comparison Systems (MACS), significantly expanding the successful Isambard 1 MACS. The ability to compare next-generation technologies with a constantly updated set of best-in-class CPUs and GPUs is a crucial part of any scientifically rigorous architectural evaluation. Because Isambard 2 will be a Cray system, we will continue to uniquely offer Cray's class-leading software toolchain to the Tier-2 community, a crucial capability for enabling upward mobility from Tier-3 through Tier-2 to Tier-1.

[1] https://gw4-isambard.github.io/docs/

Planned Impact

The Isambard 2 proposal from the GW4 Alliance, the Met Office and Cray will have many positive impacts for the UK's HPC community, for wider society, and for the UK's economy. These are detailed in the PtI document; the first five are described below:

1) Increasing the rate of adoption of UK technology in the HPC marketplace. Our system will be one of the first in the world to include next-generation Arm HPC technologies, and one of the first to be made widely available to the HPC community, both academic and commercial. Today, x86 processors account for around 85% of Top500 machines, and the UK-based Arm Ltd is aiming to win a significant fraction of the overall $19.8 billion server market. Isambard 1's evaluation results proved for the first time that Arm CPUs were competitive in HPC. Isambard 2 aims to shift this from "competitive" to "class-leading", enabling more rapid adoption of these technologies and financially benefiting the UK economy. Isambard 2 will be available to scientific code developers and Independent Software Vendors (ISVs), increasing the rate at which the codes and libraries that we all rely on are ported and made ready for next-generation, class-leading Arm CPUs.

2) Informing future procurements, from Tier-0 to Tier-3. We are just emerging from a long period in which mainstream HPC CPU technologies suffered from a lack of competition. Arm's business model is to license its designs, which results in a vibrant ecosystem of CPU vendors all competing with one another. If the Isambard 2 project proves that next-generation Arm technologies are class-leading, this should increase competition between HPC processor vendors, driving improvements in cost effectiveness and raising the rate of innovation once again.

3) Increasing research ties with leading HPC centres around the world. The next-generation Arm technologies we will use in Isambard 2 are generating high levels of interest. As ours will be one of the first such systems in the world, leading HPC centres have approached us asking to collaborate, to establish new networks of expertise, and to share results. Supporters of our proposal include some of the top HPC experts in the world, such as Prof. Satoshi Matsuoka, director of RIKEN, Japan; Dr Thuc Hoang, NNSA program manager, USA; Dr Si Hammond at Sandia National Laboratories, USA; Prof. Robert Harrison at Stony Brook, USA; Dr Rob Akers at UKAEA/CCFE, UK; AWE's deputy chief scientist; Dr Sylvain Laizet, chair of the UK Turbulence Consortium; Prof. Angelo Michaelides, Director of the Materials and Molecular Modelling Hub; Prof. Deborah Greaves, PI for CCP-WSI; Prof. Adrian Mulholland, PI for CCP-BioSim; and lead developers from the VASP and GROMACS teams, amongst others (18 in all).

4) Increasing vertical and horizontal integration, promoting greater mobility between Tier-3 and Tier-1, and between Tier-2 sites. The Consortium already works with most of the other Tier-2 sites in the UK, and by working closely with Cray, we provide an easy route for Tier-3 scientists to the many Cray Tier-1, Tier-0 and emerging Exascale sites around the world.

5) Providing user training and workshops. These are an essential component of the Isambard 2 proposal, and will be made openly available to all UK EPS researchers and beyond. Isambard 1 has been a tremendous enabler of advanced tutorials for GPU programming and porting to Arm, with 22 courses and hackathons supported for over 300 attendees to date. Our findings have been published in a series of peer-reviewed papers, one of which won the prestigious "Best Paper" award at CUG 2019; that paper published all the build and run scripts we have developed as best practice for Arm processors in an online open source repository. For Isambard 2, we will continue to distil this learning into a series of online user guides, best practice documents and training material. New workshops and training sessions will be run to help the community exploit the new technology rapidly.

