Future-proof massively-parallel execution of multi-block applications

Lead Research Organisation: University of Southampton
Department Name: Faculty of Engineering & the Environment

Abstract

For many years, increasing the clock frequency of microprocessors has led to steady improvements in performance of computer applications. This gave an almost free performance boost to the speed of applications without having to re-write software for each new generation of processors. However, increasing the performance of processors in this manner led to an unsustainable increase in energy consumption. Thus, to gain higher performance chip developers now rely on multiple cores operating in parallel. The latest CPUs have up to 10 cores, each with a vector unit producing up to 8 single precision floating point results per clock cycle, while the latest graphics processors (GPUs) have up to 2688 much simpler cores operating in groups of 32.

This move into manycore computing has led to considerable hardware innovation, and it is likely that the next 10 years will see further rapid evolution in computer architectures. This poses huge challenges to application developers who naturally wish to concentrate on their engineering and scientific applications and how best to model them, without having to worry about the details of modern computer architectures. To address this, there are a range of efforts within scientific computing to develop high-level software packages or frameworks so that the application developer can specify what they want to be computed at a high level, and then the package takes care of the implementation details.

Building on prior EPSRC-funded research to develop a framework called OP2 for unstructured grid applications, this proposal aims to develop a future-proof extension called OPS to handle the needs of multi-block structured grid applications. Developers' applications can be written in FORTRAN or C, using a carefully-designed application programming interface (API), and then OPS generates customised code for the implementation on different hardware target platforms.

As well as customising for the different hardware, two other optimisation approaches will be adopted. One is the use of ``tiling'' to overlap the execution of parallel loops which are usually executed sequentially. This improves both performance and energy efficiency by reusing data within the cache, cutting down on the number of times data is moved between the processor and the main memory. This is something which is becoming increasingly important on modern architectures because the energy cost and time taken for data movement is much greater than for floating point operations.

The other optimisation is the use of run-time optimisation for applications which execute for a long time. The backend implementations are parameterised, with parameters controlling aspects such as the number of threads in a thread block, or the size of a ``tile'' in the tiling optimisation. The optimal values for these parameters are not known a priori, and it could significantly affect the performance. By dynamically varying the values, and timing the consequential changes in performance, we can implement heuristics to iteratively improve the parameter values during the execution.

The new OPS framework will be assessed, both for performance and ease-of-use, by applying it to two important academic CFD codes, ROTOR developed at Bristol by Prof. Chris Allen, and SBLI developed by at Southampton by Prof. Neil Sandham. As well as being important codes in their own right, these are also representative of the needs of other codes within CCP12 (Computational Engineering), the UK Turbulence Consortium, and the UK Applied Aerodynamics Consortium.

Publications

10 25 50
 
Description Software developed has enabled us to assess the energy-efficiency of numerical algorithms. The open source software OpenSBLI has been released. version 2 has now been released (2019) We have alsp used the software for advanced undergraduate teaching, enabling 100 student to date to have exposure to a research computer code.
First Year Of Impact 2019
Sector Aerospace, Defence and Marine,Energy
Impact Types Societal

 
Description Horizon 2020 ExaFLOW project
Amount € 326,000 (EUR)
Funding ID 671571 
Organisation European Union 
Sector Public
Country European Union (EU)
Start  
 
Title OpenSBLI 
Description OpenSBLI is a Python-based modelling framework that is capable of expanding a set of differential equations written in Einstein notation, and automatically generating C code that performs the finite difference approximation to obtain a solution. This C code is then targetted with the OPS library towards specific hardware backends, such as MPI/OpenMP for execution on CPUs, and CUDA/OpenCL for execution on GPUs. 
Type Of Material Improvements to research infrastructure 
Provided To Others? No  
Impact Open source software released in Nov 2016 
URL https://opensbli.github.io