Northern Intensive Computing Environment

Lead Research Organisation: Durham University
Department Name: Vice Chancellor's Office

Abstract

We propose to establish a new national Tier-2 HPC service ("NICE19") by purchasing and operating a novel architecture supercomputer, based upon IBM Power9 CPUs and Nvidia Volta GPUs. This architecture is used by both the #1 and #2 systems in the most recent Top500 list of the world's supercomputers: SUMMIT and SIERRA, both based at USA government laboratories. Currently this architecture is not widely available in the UK, so by establishing a national Tier-2 facility we will add significant diversity and capability to the UK e-infrastructure. This architecture supports memory coherence between the GPU and CPU and a hierarchy of interconnects to allow effective distributed GPU use, extending the problem sizes that can be tackled beyond those of other GPU-accelerated architectures, increasing data sizes for accelerated simulation and analysis codes, and reducing the 'time to science' for a range of 'hard' problems.

The purpose of this facility is to enable new science across many different disciplines. Our primary focus will be on experimental users who generate large data sets that need analysis (e.g. cryo-EM facilities) and on modellers who use machine learning. By supporting and bringing together these two communities, there will be many opportunities for new and exciting science. This will bring HPC to many communities who have not engaged with HPC before now, with many benefits in getting better value out of experiments and existing datasets. In addition, existing investments in network infrastructure for the DiRAC consortium of STFC users will be leveraged to connect the facility to experimental facilities and other Tier-2 centres for optimal data flows between sites.

This proposal is led by the N8 Centre of Excellence in Computationally Intensive Research (N8-CIR) and supported by the N8 partnership. This will provide 8 FTE of Research Software Engineers (RSEs) to support these communities, port and optimize their software for this new platform and train users, which will be essential to maximize the benefit of this novel facility. These codes will then be able to run efficiently on the facility, supported by trained users. The architectural similarity with SUMMIT and SIERRA will also provide a route to exascale computing. This proposal will therefore have a long-lasting impact, beyond the lifetime of this facility.

Planned Impact

As well as the academic impact detailed above, NICE19 will also generate societal and economic impacts through its support for the key scientific areas.

For instance, the machine learning community is having an economic benefit by working with computer chip designers such as Intel to co-design better computer architectures for data-centric computing and by generating improved solar-power production predictions for the National Grid; in the future, improved machine learning models with full uncertainty quantification (built on NICE19) can be deployed in mobile applications for more robust predictions. The ML community is also having a societal impact, with more accurate models for the clinical diagnosis of conditions such as cystic fibrosis; analysing air pollution sensor data to help develop better transport policies which enhance the quality of life for city dwellers; and shaping how GDPR regulations will be implemented fairly and clearly.

Better analysis of experimental data will make experiments more cost efficient, which, given the high running costs of large facilities (e.g. the Diamond Light Source costs £40m p.a. to operate), will have a large economic impact. These facilities also study problems of economic significance, such as improving the performance of Li-ion batteries and developing improved perovskite-based solar cells. The science generated can also have significant societal impact, such as recent work on the structure of viruses and vaccine development, or the structure of moon rocks from the Apollo space missions.

Finally, the support for physical simulation in the different application domains will bring additional benefits above and beyond enhancing the machine learning and experimental analysis. Both biomolecular simulation and computational fluid dynamics have a well-established track record of societal and economic impact, e.g. through the design of new drugs and more efficient wind turbine blades. There are some impact case studies for these topics at https://www.n8research.org.uk/research-focus/legacy-programmes/n8-high-performance-computing/ which reflect the previous N8-HPC activity in this area.

The computational support provided by NICE19 will allow these existing efforts to be extended, and for new techniques and applications to be developed. With new research communities being supported and new codes being ported to this facility, the benefits will surely increase with time, and we will be responsive to the needs of these new communities through our NICE19 User Group. We will seek to maximize these benefits through our community building and supporting activities (facilitated by N8-CIR) and through our Annual Conference.
