Efficient Cross-Domain DSL Development for Exascale

Lead Research Organisation: University of Edinburgh
Department Name: School of Informatics

Abstract

Developing scientific software, for example for climate modeling or medical research, is a highly challenging task. Domain scientists are often deeply involved in low-level programming details just to make their code run sufficiently fast. These tedious, but important, optimization steps significantly reduce the productivity of scientists.

Domain-specific languages (DSLs) revolutionize the productivity of domain scientists by enabling them to focus on scientific questions rather than on making their code run fast. Sophisticated DSL compilers automatically generate high-performance code from high-level, domain-specific problem descriptions.

While there are individual successes, the existing landscape of DSLs is scattered, and the reuse of software components in DSL compiler implementations is limited because DSL compilers are traditionally built in isolation. This results in high development costs for new DSLs and prevents many DSLs from ever achieving a level of maturity and sustainability that enables uptake by the scientific community.

This project revolutionizes the design of DSL compiler implementations by leveraging the breadth and cross-industry support of the MLIR compiler and Python ecosystems. Python is the tool of choice for application developers in many domains, such as machine learning and data science, and, we believe, an important component of the future of high-performance computing (HPC) software. This project establishes MLIR as a common representation for code at multiple levels of abstraction in DSL compiler development. DSLs embedded in various host languages, including Python and Fortran, will be easily built on top of MLIR. Instead of building DSL compilers as isolated monolithic towers, our research will build a toolbox that enables developers to build DSLs using a rich ecosystem of shared intermediate representations (IRs) and optimizations.

This project evaluates, drives, and demonstrates the DSL design toolbox by using it to build the next generation of DSLs for seismic and climate modelling as well as medical imaging. These DSLs will share common software components and make them available to other DSLs. An extensive evaluation will show the scalability of DSL software towards exascale.

Finally, this project investigates how future disruptors, including artificial intelligence, data science, and on-demand HPC-as-a-service, will shape and influence the next generations of high-performance software. This project will work towards deeply integrating modern interactive data analytics and machine learning methods from the Python ecosystem with high-performance scientific code.
