SERT: Scale-free, Energy-aware, Resilient and Transparent Adaptation of CSE Applications to Mega-core Systems

Lead Research Organisation: Queen's University Belfast
Department Name: Sch of Electronics, Elec Eng & Comp Sci

Abstract

Moore's Law and Dennard scaling have led to dramatic performance increases in microprocessors, the basis of modern supercomputers, which consist of clusters of nodes that include microprocessors and memory. This design is deeply embedded in parallel programming languages, the runtime systems that orchestrate parallel execution, and computational science applications.

Some deviations from this simple, symmetric design have occurred over the years, but now we have pushed transistor scaling to the extent that simplicity is giving way to complex architectures. The absence of Dennard scaling, which has not held for about a decade, and the atomic dimensions of transistors have profound implications for the architecture of current and future supercomputers.

Scalability limitations will arise from insufficient data access locality. Exascale systems will have up to 100x more cores and commensurately less memory space and bandwidth per core. However, in-situ data analysis, motivated by decreasing file system bandwidths, will increase the memory footprints of scientific applications. Thus, we must improve per-core data access locality and reduce contention and interference for shared resources.

Energy constraints will fundamentally limit the performance and reliability of future large-scale systems. These constraints lead many to predict a phenomenon of "dark silicon" in which half or more of the transistors on each chip must be powered down for safe operation. Low-power processor technologies based on sub-threshold or near-threshold voltage operation are a viable alternative. However, these techniques dramatically decrease the mean time to failure at scale and, thus, require new paradigms to sustain throughput and correctness.

Non-deterministic performance variation will arise from design process variation that leads to asymmetric performance and power consumption in architecturally symmetric hardware components. The manifestations of the asymmetries are non-deterministic and can vary with small changes to system components or software. This performance variation produces non-deterministic, non-algorithmic load imbalance.

Reliability limitations will stem from the massive number of system components, which proportionally reduces the mean time to failure, but also from component wear and from low-voltage operation, which introduces timing errors. Infrastructure-level power capping may also compromise application reliability or create severe load imbalances.

The impact of these changes on technology will travel as a shockwave throughout the software stack. For decades, we have designed computational science applications based on very strict assumptions that performance is uniform and processors are reliable. In the future, hardware will behave unpredictably, at times erratically. Software must compensate for this behavior.

Our research anticipates this future hardware landscape. Our ecosystem will combine binary adaptation, code refactoring, and approximate computation to prepare CSE applications. We will provide them with scale-freedom - the ability to run well at scale under dynamic execution conditions - with at most limited, platform-agnostic code refactoring. Our software will provide automatic load balancing and concurrency throttling to tame non-deterministic performance variations. Finally, our new form of user-controlled approximate computation will enable execution of CSE applications on hardware with low supply voltages, or any form of faulty hardware, by selectively dropping or tolerating erroneous computation that arises from unreliable execution, thus saving energy. Cumulatively, these tools will enable non-intrusive reengineering of major computational science libraries and applications (2DRMP, Code_Saturne, DL_POLY, LB3D) and prepare them for the next generation of UK supercomputers. The project partners with NAG, a leading UK HPC software and service provider.

Planned Impact

The project will achieve commercial impact through the development of production-level Computational Science and Engineering Software that will catalyse performance and productivity in applications within the EPSRC remit; industrial engagement with UK and international stakeholders, in particular through membership of project partners in the European Technology Platform for HPC (ETP4HPC); exploration of the potential to receive follow-on funding and create spin-out companies with instruments such as the Impact Account Acceleration at Queen's Belfast; and the organisation of an industrial workshop. The project will achieve further economic impact through better utilisation and reduction of the total cost of ownership of the major UK supercomputing infrastructures and improved productivity in sectors of the UK high-technology economy that depend on HPC.

The project will achieve academic impact by publishing results in the very best journals and conferences across the areas of high performance computing, computational science, scientific computing, programming languages and computer architecture. All publications will follow Green or Gold open access routes, the former leveraging institutional publication repositories and the latter institutional funding. All software developed in the project will be open-sourced, with associated training provision in the form of tutorials and short modules. Further academic impact will be achieved via exchange visits and demonstration sessions with project partner NAG, ClusterVision, and other HPC vendors and groups in the UK.

Societal impact will be achieved through prominent presence in social media (Web 2.0, LinkedIn, Twitter and YouTube Channels) to disseminate the results to professionals and the general public. Further societal presence will be achieved through distribution of news articles, press releases, and video presentations. The project will develop software technologies for emerging many-core systems, a skill which is highly marketable.

The project follows a comprehensive software management plan. It will produce three software outputs (Adaptor, RightSizer, Approximator), licensed under GPL. The tools will be developed, tested and maintained in a GitLab software repository, with the associated Git revision control system hosted by Queen's Belfast and shared between the project partners. The software will be user-level and will not require interventions to the host operating system, since such interventions would prevent its deployment on the target systems (ARCHER, BlueJoule, NextScale, Titan). It will be based on the GNU stack for maximum portability across current and future platforms. The software will support and be compatible with widely used parallel programming languages (MPI, OpenMP, OpenCL) and libraries (MAGMA, PLASMA, ATLAS). Source code changes in MPI, OpenMP and OpenCL, where needed, will be feasible through the adoption of their open-source implementations (e.g. OpenMPI, PoCL, GOMP).

The software will be released to and hosted for the public by Queen's Belfast during the course of the project, and later by STFC for production use on the targeted supercomputers. The GitLab repository that will house the software at QUB is well tested and already provides support for code development, maintenance, revision control and testing in nine large-scale software development projects (EPSRC, FP7/H2020, and industry-led), involving 28 research groups in the UK, Germany, Switzerland, Sweden, Greece, Austria, Ireland and the US, and totalling hundreds of KLOC in C/C++ parallel code. We will use Doxygen for formal code documentation, DokuWiki for informal documentation and discussion among developers, and Bugzilla for bug tracking. We will use nightly builds and regression tests. A permanent research engineer funded from Queen's will undertake the role of software maintenance and quality control manager and will be responsible for maintaining the highest coding and documentation
 
Description In the context of SERT, we developed REFINE, a novel fault injection (FI) framework that addresses the limitations of existing FI methods by performing FI in a compiler backend. Our approach provides the portability and efficiency of compiler-based FI, while keeping accuracy comparable to binary-level FI methods. We demonstrated our approach on 14 HPC programs and showed that, due to our unique design, its runtime overhead is significantly smaller than that of state-of-the-art compiler-based FI frameworks, reducing the time for large FI experiments. We also developed a significant codebase of tools for improving the scaling of parallel programs. Specifically, we developed
SCALO, a tool that increases throughput of jobs on supercomputer nodes. SCALO optimizes resource allocation of parallel
programs running concurrently on the same node, by minimising contention and adapting resource allocation to the
scalability potential of co-runners. A particular strength of our tool is that it can be deployed to existing supercomputer
infrastructure, without disrupting the pre-deployed installations. The initial evaluation using benchmarks of HPC application
proxies shows promising results, and we are expanding its use to large-scale applications.

Approximation is an emerging research method for speeding up execution by trading computational accuracy for performance.
We chose to extend the widely used OpenMP parallel language with constructs that express approximation opportunities
in parallel computations. We built these extensions on existing, industrial-quality tools, including a compiler
(Clang/LLVM) and a parallel runtime (Intel OpenMP runtime). Through our extensions, HPC developers have a structured way to
include approximation in parallel programs and dictate how this is implemented at the execution runtime. For example, the
developer annotates computational tasks as amenable to approximation and configures the runtime to perform those computations
with reduced accuracy or even completely drop them for aggressive speed optimisation. We have demonstrated the applicability
of our approximation techniques in numerical kernels and we are in the process of evaluating them on large-scale applications.
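
The annotate-then-configure idea described above can be illustrated with a minimal sketch. It is not the project's OpenMP extension and does not reproduce its syntax: the names Task, run_tasks and make_exp_task, and the three mode strings, are hypothetical and chosen only for illustration. The sketch runs sequentially in plain Python, whereas the real extensions target a parallel runtime.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    """A unit of work (hypothetical type, for illustration only)."""
    accurate: Callable[[], float]                     # full-accuracy computation
    approximable: bool = False                        # developer's annotation
    approximate: Optional[Callable[[], float]] = None  # cheaper variant, if any


def run_tasks(tasks: List[Task], mode: str = "accurate") -> List[Optional[float]]:
    """Run tasks under a runtime-wide policy (hypothetical modes).

    "accurate":    every task runs its accurate version.
    "approximate": annotated tasks run their cheaper version when one exists.
    "drop":        annotated tasks are skipped and yield None.
    Tasks that are not annotated always run accurately.
    """
    if mode not in ("accurate", "approximate", "drop"):
        raise ValueError(f"unknown mode: {mode}")
    results: List[Optional[float]] = []
    for task in tasks:
        if mode == "accurate" or not task.approximable:
            results.append(task.accurate())
        elif mode == "drop":
            results.append(None)
        elif task.approximate is not None:
            results.append(task.approximate())
        else:
            # No cheaper variant was supplied, so fall back to the accurate one.
            results.append(task.accurate())
    return results


def make_exp_task(x: float, approximable: bool) -> Task:
    """Build a task computing exp(x), with a 2nd-order Taylor approximation."""
    return Task(
        accurate=lambda: math.exp(x),
        approximable=approximable,
        approximate=lambda: 1.0 + x + 0.5 * x * x,
    )


tasks = [
    make_exp_task(0.1, approximable=False),  # must stay exact
    make_exp_task(0.2, approximable=True),
    make_exp_task(0.3, approximable=True),
]
exact = run_tasks(tasks, "accurate")
approx = run_tasks(tasks, "approximate")
dropped = run_tasks(tasks, "drop")
```

In this sketch the annotation lives with the task and the policy lives with the runtime, so the same program can be run accurately, approximately, or with annotated work dropped, without changing the task code.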
Exploitation Route We make all the tools we develop available to our research partners, who are HPC application and numerical library developers.
Also, we intend to release our software to the wider scientific community while fine-tuning it for usability and performance,
using invaluable feedback from our partners and domain experts.

The vision for our SCALO tool is to be part of the system services provided by supercomputer facilities. Its usage will
enable users to co-locate jobs on nodes for increasing utilisation and throughput of supercomputer installations. Our
approximation framework presents a robust, ready-to-use solution by extending existing standards (OpenMP) used for programming HPC
applications. This enables developers to include approximation in their applications.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description The findings of the project are actively being used to inform software engineering practices and improve software productivity as well as resilience of production-strength software in two supercomputing centres in the UK (STFC) and the US (LLNL).
First Year Of Impact 2017
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Description EU Horizon2020 Programme: AllScale: An Exascale Programming, Multi-objective Optimisation and Resilience Management Environment Based on Nested Recursive Parallelism.
Amount € 438,578 (EUR)
Funding ID 671603 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 10/2015 
End 09/2018
 
Description EU Horizon2020 Programme: ECOSCALE: Energy-Efficient Heterogeneous Computing at Scale
Amount € 696,750 (EUR)
Funding ID 671632 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 10/2015 
End 09/2018
 
Description EU Horizon2020 Programme: UniServer Project
Amount € 663,625 (EUR)
Funding ID 687628 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 02/2016 
End 01/2019
 
Description Horizon2020 Programme
Amount € 5,999,510 (EUR)
Funding ID H2020-732631 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 01/2017 
End 12/2020
 
Description Royal Society Wolfson Research Merit Award: Principles and Practice of Near-Data Computing
Amount £50,000 (GBP)
Funding ID WM150009 
Organisation The Royal Society 
Sector Charity/Non Profit
Country United Kingdom
Start 09/2015 
End 08/2020
 
Description SFI-DEL Investigators Programme: Meeting the Challenges of Heterogeneous and Extreme Scale Parallel Computing
Amount £521,947 (GBP)
Funding ID 14/IA/2474 
Organisation Science Foundation Ireland (SFI) 
Sector Charity/Non Profit
Country Ireland
Start 09/2015 
End 08/2020
 
Description Collaboration with IBM on disaggregated memory technologies and near-data computing 
Organisation IBM
Country United States 
Sector Private 
PI Contribution Our research team has contributed methods to manage data caching and placement on disaggregated memory architectures with near-data processing elements.
Collaborator Contribution IBM has contributed novel remote memory server infrastructures and near-data acceleration technologies.
Impact Materialised through an industrial placement of QUB research staff, this partnership is exploring designs to substantially improve the energy-efficiency of large memory systems, via the use of disaggregation of memory, RDMA-based networking to remote memory devices and near-data accelerators for in-situ, in-memory analytics.
Start Year 2015
 
Description Collaboration with Maxeler on integrating dataflow accelerators in Big Data software stacks 
Organisation Maxeler Technologies Inc
Department Maxeler Technologies
Country United Kingdom 
Sector Private 
PI Contribution Integration of Maxeler's dataflow engines into the Spark, Storm and other Big Data software stacks, in collaboration with Maxeler Technologies and STFC Hartree.
Collaborator Contribution Programming APIs for Maxeler dataflow accelerators.
Impact No outputs yet; extensions of Spark and Storm with streaming APIs using Maxeler dataflow engines are currently under design.
Start Year 2016
 
Description Collaboration with NHS (Belfast HSCT) on real-time analytics of ICU patient data 
Organisation Royal Victoria Hospital, Belfast
Country United Kingdom 
Sector Hospitals 
PI Contribution A real-time analytics appliance (micro-server plus in-memory data analytics software) for continuously analysing respiratory data of ICU patients, with the objective of regulating oxygen intake and preventing lung injury.
Collaborator Contribution Analytical algorithms and infrastructure support at Royal Victoria Hospital, Belfast.
Impact Appliance operating and automatically detecting potential lung injury emergencies at Royal Victoria Hospital ICU.
Start Year 2015
 
Description NVTV Interview on Supercomputing 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Interview in NVTV's Behind the Science program on Supercomputing as a technology with impact on our everyday lives.
Year(s) Of Engagement Activity 2015
URL http://www.nvtv.co.uk/shows/behind-the-science-dimitrios-nikolopoulos/