Verifiably correct concurrency abstractions

Lead Research Organisation: Brunel University London
Department Name: Computer Science

Abstract

Multi-core computing architectures have become ubiquitous over the last decade. This has been driven by the demand for continual performance improvements to cope with the ever-increasing sophistication of applications, combined with physical limitations on chip designs, whereby speed-up via higher clock speeds has become infeasible. The inherent parallelism that multi-core architectures entail offers great technical opportunities, however, exploiting these opportunities presents a number of technical challenges.

To ensure correctness, concurrent programs must be properly synchronised, but synchronisation invariably introduces sequential bottlenecks, causing performance to suffer. Fully exploiting the potential for concurrency requires optimisations to consider executions at low levels of abstraction, e.g., the underlying memory model, compiler optimisations, cache-coherency protocols etc. The complexity of such considerations means that checking correctness with a high degree of confidence is extremely difficult. Concurrency bugs have specifically been attributed to disasters such as a power blackout in north-eastern USA, Nasdaq's botched IPO of Facebook shares, and the near failure of NASA's Mars Pathfinder mission. Other safety-critical errors have manifested from using low-level optimisations, e.g., the double-checked locking bug and the Java Parker bug.

This project improves programmability of concurrent programs through the use of scalable atomicity abstractions such as TM and concurrent objects that make low-level optimisations available to general application programmers. Operations of such objects are highly concurrent (which improves efficiency), yet manage synchronisation on behalf of a programmer to provide an illusion of atomicity. Thus, by using TM, the focus of a programmer switches from what should be made atomic, as opposed to how atomicity should be guaranteed. This means concurrent systems can be developed in a layered manner (enabling a separation of concerns).

The attractive set of features that TM promises means that TM implementations are increasingly being incorporated into mainstream systems (hardware and software). Since the adaptation of transactions from database theory in the mid 1990s, software TM implementations are now available for all major programming languages. Recent advances include experimental features in compilers such as G++ 4.7 that directly enable compilation of transactional code; standardisation work to include TM within C++ is ongoing. There is extensive research interest in hybrid TM within both academia and industry to make best use of, for example, TM features in Intel's Haswell/Broadwell and IBM's Blue Gene/Q processors.

The high level of complexity, yet wide-scale applicability of TM means that implementations must be formally verified to ensure dependability and reliability. Overall, we will improve the dependability, performance, and flexibility of TM implementations.

Planned Impact

The global benefits of multi-core computing are clear - better performance, and more sophisticated applications. Correct concurrent algorithms are key to ensuring these benefits are realised. That is, without concurrent algorithms, the performance benefits that multi-core processors can offer are diminished, and in turn, the range of applications that can take advantage of these benefits is limited (to those where high reliability is not a concern). Therefore, there is a large international effort in both academia and industry aimed at developing the foundational and theoretical underpinnings of concurrent systems, including improved abstractions, tool support, engineering solutions etc.

The aim of this travel grant is to further work that will help (1) deliver an H2020 bid to train PhD students across a range of topics on verification and testing of concurrent systems; (2) develop foundations for TM correctness that takes real hardware into account, ultimately leading to research proposal in collaboration with industry; and (3) understand the applicability of TM research in the important upcoming area of blockchains, which feeds into a third research proposal.

Our work seeks to make an impact by enabling such algorithms to be verified correct. To date, there has been little work in this area, or on the memory models of real processors. Because of their subtlety, it can be exceptionally difficult to isolate any faults in correctness through testing alone, and their potential widespread use means that verified correctness (particularly at processor level) will be important.

Work on verifying correctness of concurrent algorithms will help translate the potential benefits of these developments into real, tangible ones. Consequently, the ultimate economic implications of this work are enormous. Outside of academia, we expect our work to be useful to several types of companies:

- Hardware manufacturers, e.g., IBM, Intel, AMD, and ARM, who require better-performing software to enable introduction of additional cores in the processors they produce. Without software-side improvements, end-users will not benefit when increasing the number of cores.

- Programming language developers, e.g., Java (Oracle), C++, Scala, Python, who use predefined libraries to provide high-end functionality to programmers in a layered re-usable manner.

- Application developers, e.g., Mozilla (web applications), Valve Software (video gaming), Codeplay Software (SoC tools), etc., who require reliable high-performance libraries to take advantage of the available multi-core hardware.

- Formal verification companies, who will benefit from the techniques we use to verify novel concurrent data structures for relaxed memory. Of particular interest will be the mechanisations we develop in Isabelle.

- Large-scale (cloud-based) developers, e.g., Google, Facebook and Amazon, who are interested in correctness conditions such as causal and eventual consistency in the databases they use. There is a direct link between the correctness conditions we study and such notions of consistency.

Ultimately the impact of this will be to enhance the correctness of software using such algorithms, which, with multi-core technology and implementation of specific TM models, could see very widespread usage.

Publications

10 25 50
 
Description Further insights gained on a broad range of topics, which has directly led to three new projects around concurrent systems, security and persistent memory.
Exploitation Route This is a personal travel grant. It has led to further funding from the ESPRC, ARC and VeTSS, as well as an H2020 project (under review).
Sectors Digital/Communication/Information Technologies (including Software)

 
Description Design and verification of correct, efficient and secure concurrent systems
Amount $460,000 (AUD)
Funding ID DP190102142 
Organisation Australian Research Council 
Sector Public
Country Australia
Start 01/2019 
End 12/2021
 
Description FACT: Faithful Composition of Trust
Amount £204,000 (GBP)
Organisation National Cyber Security Centre 
Sector Public
Country United Kingdom
Start 10/2019 
End 03/2021
 
Description Persistent Safety and Security
Amount £78,000 (GBP)
Organisation The Research Institute on Verified Trustworthy Software Systems 
Sector Academic/University
Country United Kingdom
Start 10/2019 
End 03/2023
 
Description Verifiably Correct Transactional Memory
Amount £397,680 (GBP)
Funding ID EP/R032556/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 09/2018 
End 09/2021