Edinburgh DiRAC Resource Grant

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Physics and Astronomy

Abstract

DiRAC (Distributed Research utilising Advanced Computing) is the integrated supercomputing facility for theoretical modelling and HPC-based research in particle physics, nuclear physics, astronomy and cosmology, areas in which the UK is world-leading. It was established with an investment of £12.32 million from the Government's Large Facilities Capital Fund, together with investment from STFC and from universities. In 2012, the facility was upgraded with a further £15 million of government capital investment (DiRAC-2).

The DiRAC facility provides a variety of computer architectures, matching machine architecture to the algorithm design and requirements of the research problems to be solved. The science it supports includes: calculating what theories of the early universe predict and testing those predictions against observations of the present universe; lattice field theory calculations whose primary aim is to increase the predictive power of the Standard Model of elementary particle interactions through numerical simulation of Quantum Chromodynamics; and state-of-the-art cosmological simulations, covering the large-scale distribution of dark matter, the formation of dark matter haloes, the formation and evolution of galaxies and clusters, the physics of the intergalactic medium and the properties of the intracluster gas.

This grant supports the continued operation of the DiRAC facilities until 2017 to ensure that the UK remains one of the world leaders in theoretical modelling in particle physics, astronomy and cosmology.

Planned Impact

The high-performance computing applications supported by DiRAC typically involve new algorithms and implementations optimised for high energy efficiency. These impose demands on computer architectures that the computing industry has found useful for designing and testing hardware and system software.

DiRAC researchers have ongoing collaborations with computing companies. These collaborations maintain a strong connection between the scientific goals of the DiRAC Consortium and the development of new computing technologies that drive the commercial high-performance computing market. The companies involved benefit economically, and more powerful computing capabilities become available to other application areas, including many that address socio-economic challenges.

Publications

Aarts G (2016) Finite Temperature Lattice QCD - Baryons in the Quark-Gluon Plasma in Acta Physica Polonica B Proceedings Supplement

Aarts G (2015) The Phase Diagram of Heavy Dense QCD with Complex Langevin Simulations in Acta Physica Polonica B Proceedings Supplement

Aarts G (2016) Complex Langevin in Lattice QCD: Dynamic Stabilisation and the Phase Diagram in Acta Physica Polonica B Proceedings Supplement

Constantino T (2021) Suppression of lithium depletion in young low-mass stars from fast rotation in Astronomy & Astrophysics

Hildebrandt H (2020) KiDS+VIKING-450: Cosmic shear tomography with optical and infrared data in Astronomy & Astrophysics

Ziampras A (2023) Hydrodynamic turbulence in disks with embedded planets in Astronomy & Astrophysics

 
Description In December 2009, the STFC facility DiRAC was established to provide distributed High Performance Computing (HPC) services for theoretical modelling and HPC-based research in particle physics, astronomy and cosmology. DiRAC provides a variety of computer architectures, matching machine architecture to the algorithm design and requirements of the research problems to be solved. This grant funds the continued operation of the 1.3 Pflop/s Blue Gene/Q system at the University of Edinburgh, which was co-developed by Peter Boyle (University of Edinburgh) and IBM. The system was designed to run with high energy efficiency for months at a time on a single problem, in order to solve some of the most complex problems in physics, particularly the strong interactions of quarks and gluons. The DiRAC Facility supports over 250 active researchers at 27 UK HEIs. This includes the research projects of 100 funded research staff (PDRAs and Research Fellows), over 50 postgraduate projects, and £1.6M of funded research grants.
Exploitation Route The theoretical results obtained feed into a range of experimental programmes aiming to increase our understanding of Nature. The algorithms and software developed provide input to computer design.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://dirac.ac.uk/
 
Description Intel IPAG QCD codesign project 
Organisation Intel Corporation
Department Intel Corporation (Jones Farm)
Country United States 
Sector Private 
PI Contribution We have collaborated with Intel Corporation since 2014, with $720k of total direct funding, starting as an Intel Parallel Computing Centre and expanding to close, direct collaboration with the Intel Pathfinding and Architecture Group.
Collaborator Contribution We have performed detailed optimisation of QCD codes (Wilson, Domain Wall, Staggered) on Intel many-core architectures. We have investigated the memory system and interconnect performance, particularly on Intel's latest interconnect hardware, Omni-Path. We found serious performance issues and worked with Intel to plan a solution, which has been verified and is available as beta software. It will reach general availability in the Intel MPI 2019 release and will allow threaded concurrent communications in MPI for the first time. A joint paper on the resolution was written with the Intel MPI team. In that paper, the same QCD programming techniques were applied to machine learning gradient reduction using the Baidu Research all-reduce library, demonstrating a 10x gain for this critical step in machine learning in clustered environments. We are also working with Intel on verifying future architectures that will deliver exascale performance in 2021.
Impact We have performed detailed optimisation of QCD codes (Wilson, Domain Wall, Staggered) on Intel many-core architectures. We have investigated the memory system and interconnect performance, particularly on Intel's latest interconnect hardware, Omni-Path. We found serious performance issues and worked with Intel to plan a solution, which has been verified and is available as beta software. It will reach general availability in the Intel MPI 2019 release and will allow threaded concurrent communications in MPI for the first time. A joint paper on the resolution was written with the Intel MPI team. In that paper, the same QCD programming techniques were applied to machine learning gradient reduction using the Baidu Research all-reduce library, demonstrating a 10x gain for this critical step in machine learning in clustered environments. This collaboration has been renewed annually in 2018, 2019 and 2020. Two DiRAC RSEs were hired by Intel to work on the Turing collaboration.
Start Year 2016