Expanding, supporting, and training the Sulis tier 2 user community

Lead Research Organisation: University of Warwick
Department Name: Physics

Abstract

Powerful high performance computing (HPC) facilities can transform the scope and ambition of research which relies on simulation, data processing and analysis. However, the number of research communities which benefit from investment in expensive supercomputing hardware is limited. Training is often focussed on technologies which were developed as the means to perform very large simulations at high fidelity (capability computing), rather than on running smaller scale calculations or analyses on large numbers of inputs concurrently (high throughput or ensemble computing). In some research communities, the de facto standard analysis software or programming language is not well served by traditional HPC documentation and training. In others (such as the arts and humanities) there is the potential to benefit from HPC, but no awareness of (and often no access to) large facilities. This is compounded by there being no tradition of research computing training in those disciplines.

Even within HPC-literate research communities, it can be the case that only parts of a scientific computing workflow are well served by traditional HPC access models, requiring movement of large amounts of data between local facilities and the HPC facility. This is costly in terms of both time and energy.

This proposal aims to address some of these barriers in the context of the Sulis tier 2 facility, which focusses on high throughput and ensemble computing. We will resource additional support and hardware to address some of the challenges to exploitation of the facility. The proposal targets research communities beyond those originally envisioned, particularly the metagenomics community, who often require more memory per calculation than can be served by most HPC hardware. We will also pilot training and demonstrator projects with the observational astronomy community at Warwick (data processing), and with the History of Art department (computer vision). HPC training for users of the R programming language will also be developed and piloted. In each case users will gain hands-on experience with the technology on Sulis.

Reproducibility in HPC-intensive calculations is a recognised problem in the community. Published research which relies on expensive HPC calculations is seldom reproduced due to that expense. In other scientific computing contexts, ReproHack events (https://reprohack.org/) are gaining traction. In these events, teams of young researchers attempt to reproduce data from published research papers, which is a valuable way to teach best practice in research reporting and reproducibility. Working with a founding member of the ReproHack project, we will trial an HPC ReproHack using the resources provided by Sulis. This trial event will involve PhD students at the University of Warwick and will be used to inform the design of future HPC-intensive ReproHack activities.

Publications

Description: The high-memory (4TB) servers have been added to the Sulis tier 2 facility and are now in regular use. Note that publications reported against this award are those where the authors have reported that use of these servers was key to the research. Publications which use the Sulis facility in general (not the 4TB servers) are not reported here, since the grant funding the main facility does not appear in Researchfish as an award, and Sulis does not appear in Researchfish as a facility users can associate with their own research awards.

The proposed pilot training activities were carried out in March 2022 as proposed.
Exploitation Route: The format for an HPC ReproHack and the outcomes of the activity are available at reprohack.org. This pilot can be adopted and taken forward by others looking to run similar activities for students at their own institution. The parallel R training material used in the pilot training is available at https://github.com/sulis-hpc/R-hpc for others to modify and re-use for their own training events.

The 4TB servers are available to all users of the Sulis service until at least March 2025.
Sectors: Digital/Communication/Information Technologies (including Software); Education

 
Description: HPC ReproHack
Geographic Reach: Local/Municipal/Regional
Policy Influence Type: Influenced training of practitioners or researchers
URL: https://www.reprohack.org/event/14/
 
Description: Training material on parallel R for ensemble computing
Geographic Reach: National
Policy Influence Type: Influenced training of practitioners or researchers
URL: https://github.com/sulis-hpc/R-hpc