IRIS Digital Asset Proposal - DASKhub Salary Buy Out

Lead Research Organisation: University of Cambridge
Department Name: Chemistry

Abstract

This proposal is to create a general-purpose tool for scaling Python-based applications across many nodes of a single cluster, enabling in-situ analytics and pre- and post-processing. This is relevant in many areas of science and engineering: in Monte Carlo methods, for example, many runs must be analysed to produce a final result, and in engineering modelling a pre-processing step converts a CAD model into a 2D or 3D mesh, followed by post-processing visualisation of the results of thermo-mechanical modelling. Many new codes are built using Python, Pandas, NumPy, etc., but these tools have scalability limitations beyond a single node. When presented with large data sets that do not fit into the memory of a single node, applications reach an I/O bottleneck. Several tools have been built to overcome some or all of these limitations, such as DASK, RAY, MODIN and RAPIDS.

This proposal builds on previous work on the DEISA prototype, which used DASK to provide in-situ post-processing. Within this proposal we look at extending DEISA by integrating RAPIDS. By combining DASK and RAPIDS we will provide a framework suitable for modern hybrid architectures that combine GPUs and CPUs, regardless of the GPU supplier.
