ExCALIBUR HES: Exascale Data Testbed for Simulation, Data Analysis & Visualisation

Lead Research Organisation: University of Cambridge
Department Name: Chemistry

Abstract

This proposal is phase 3, i.e. a continuation of two previous Cambridge ExCALIBUR H&ES funded projects of the same name, "Exascale Data Testbed for Simulation, Data Analysis & Visualisation". It builds on those two projects to design, build and make available to the ExCALIBUR application community state-of-the-art solid-state I/O prototype platforms. These platforms can be used to understand and characterise novel, emerging solid-state file system technologies and to help develop UK ExCALIBUR applications capable of the high I/O performance needed to scale at exascale.

This phase 3 proposal will extend the service of the phase 1 and 2 testbeds for another 2 years and add to the functionality and range of file systems provided. This is enabled by new software tools and additional storage hardware. The need for this additional functionality emerged from extensive engagement with the ExCALIBUR projects Excalidata and Excalistore, led by PI Bryan Lawrence, and from direct engagement with the Met Office, UKAEA, DiRAC, IRIS and SKA SRC projects. The proposal is split into 2 separate sections:
A) I/O profiling software tools at application and system level
B) Additional hardware for I/O testbeds

A) I/O profiling software tools
Two commercial I/O profiling tools (Altair's Mistral and Breeze) will be procured with capital funding from this call and installed on the operational Cambridge CSD3 HPC system and on the dedicated ExCALIBUR I/O testbeds resulting from the phase 1 and 2 Cambridge ExCALIBUR H&ES awards. Staff time, contributed at 2 FTE, is funded by the Cambridge Open Exascale Lab. Cambridge has already trialled these tools under evaluation licences from Altair and found them very useful.

These products provide the two views of HPC I/O we need to understand: a microscopic view at the application level and a macroscopic view at the system-wide level. Combined, they should give a much better picture of what is going on with HPC I/O, providing the tools we need to understand the behaviour of the different ExCALIBUR prototype file systems and then to optimise the applications we choose to run there.

B) Additional hardware for data testbeds
During community engagement with the Excalistore, Excalidata, DiRAC, IRIS and SKA SRC communities we have seen a strong need to develop and test high-performance object storage for HPC using both SSD and spinning-disk platforms. We also need a small spinning-disk testbed to evaluate the same candidate file systems we build on the current solid-state testbed developed in the phase

Key deliverables - Technical reports on programme of works items 1-4. These will be fashioned into white papers and published through industrial partners Dell/Intel via the Exascale Lab, and also written up as academic papers. We have received a lot of interest in the approach of using a large-scale, like-for-like NVMe testbed with different file systems, and the work is highly publishable. The work will also produce valuable testbeds, analysis systems and skilled people, enabling the UK HPC user communities mentioned above to test and improve the I/O of key UK candidate exascale codes.
