Verifiably Correct Transactional Memory.

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

Multi-core computing architectures have become ubiquitous over the last decade. This has been driven by the demand for continual performance improvements to cope with the ever-increasing sophistication of applications, combined with physical limitations on chip designs, whereby speedup via higher clock speeds has become infeasible. The inherent parallelism that multi-core architectures entail offers great technical opportunities, however, exploiting these opportunities presents a number of technical challenges.

To ensure correctness, concurrent programs must be properly synchronised, but synchronisation invariably introduces sequential bottlenecks, causing performance to suffer. Fully exploiting the potential for concurrency requires optimisations to consider executions at low levels of abstraction, e.g., the underlying memory model, compiler optimisations, cache-coherency protocols etc. The complexity of such considerations means that checking correctness with a high degree of confidence is extremely difficult. Concurrency bugs have specifically been attributed to disasters such as a power blackout in north eastern USA, Nasdaq's botched IPO of Facebook shares, and the near failure of NASA's Mars Pathfinder mission. Other safety-critical errors have manifested from using low-level optimisations, e.g., the double-checked locking bug and the Java Parker bug.

This project improves programmability of concurrent programs through the use of transactional memory (TM), which is a concurrency mechanism that makes low-level optimisations available to general application programmers. TM is an adaptation of transactions from databases. TM operations are highly concurrent (which improves efficiency), yet manage synchronisation on behalf of a programmer to provide an illusion of atomicity. Thus, by using TM, the focus of a programmer switches from what should be made atomic, as opposed to how atomicity should be guaranteed. This means concurrent systems can be developed in a layered manner (enabling a separation of concerns).

The attractive set of features that TM promises means that TM implementations are increasingly being incorporated into mainstream systems (hardware and software). Since the adaptation of transactions from database theory in the mid 1990s, software TM implementations are now available for all major programming languages. Recent advances include experimental features in compilers such as G++ 4.7 that directly enable compilation of transactional code; standardisation work to include TM within C++ is ongoing. There is extensive research interest in hybrid TM within both academia and industry to make best use of, for example, TM features in Intel's Haswell/Broadwell and IBM's Blue Gene/Q processors.

The high level of complexity, yet wide-scale applicability of TM means that implementations must be formally verified to ensure dependability and reliability.

This project addresses some of the main challenges surrounding TM, and takes the key steps necessary to facilitate wide-scale adoption. Namely, we deliver theoretical advances in our understanding of TM correctness; methodological advances in verification techniques for TM; and pragmatic advances via the development of application-aware TM designs. Verification tools will support each of these steps. We therefore set the following objectives:

O1. Develop foundations for TM correctness (atomicity and interaction guarantees) under different execution models and relate these to client correctness.
O2. Mechanically verify correctness of TM implementations, and develop principled proof techniques.
O3. Design TM implementations that provide better performance under current and future multi-core hardware.
O4. Develop tool support to simplify mechanised verification of TM and automated checking of client programs that use them.

Overall, we will improve the dependability, performance, and flexibility of TM implementations.

Planned Impact

The global benefits of multi-core computing are clear - better performance, and more sophisticated applications. Correct concurrent algorithms are key to ensuring these benefits are realised. That is, without concurrent algorithms, the performance benefits that multi-core processors can offer are diminished, and in turn, the range of applications that can take advantage of these benefits is limited (to those where high reliability is not a concern). Therefore, there is a large international effort in both academia and industry aimed at developing the foundational and theoretical underpinnings of concurrent systems, including improved abstractions, tool support, engineering solutions etc.

Transactional Memory (TM) implementations have been shown to provide programmer-friendly concurrency abstractions, simultaneously achieving high performance and system safety. Software TM (STM) is already being implemented in programming language libraries, while hardware TM (HTM) is available in modern hardware. Hybrid TM (HyTM), which provides software-based fallback mechanisms for HTM whenever the inherent hardware limitations are reached, is also becoming increasingly available.

In this project, at a technical level, we will (1) develop foundations for TM correctness; (2) mechanically verify STM and HyTM algorithms using a combination of techniques; (3) develop methods for designing new algorithms; (4) develop tool support to simplify mechanised verification and deployment of TMs.

Work on verifying correctness of these STM and HyTM algorithms will help translate the potential benefits of these developments into real, tangible ones. Consequently, the ultimate economic implications of this strand of research are enormous.

Our work seeks to make an impact by enabling such algorithms to be verified correct. To date, there has been little work in this area, or on the memory models of real processors. Because of their subtlety, it can be exceptionally difficult to isolate any faults in correctness through testing, and their potential widespread use means that verified correctness (particularly at processor level) will be important.

Outside of academia, we expect our work to be useful to several types of companies:

- Hardware manufacturers, e.g., IBM, Intel, AMD, and ARM, who require better-performing software to enable introduction of additional cores in the processors they produce. Without software-side improvements, end-users will not benefit when increasing the number of cores.

- Programming language developers, e.g., Java (Oracle), C++, Scala, Python, who use predefined libraries to provide high-end functionality to programmers in a layered re-usable manner.

- Application developers, e.g., Mozilla (web applications), Valve Software (video gaming), Codeplay Software (SoC tools), etc., who require reliable high-performance libraries to take advantage of the available multi-core hardware.

- Formal verification companies, who will benefit from the techniques we use to verify novel concurrent data structures for relaxed memory. Of particular interest will be the mechanisations we develop in Isabelle.

- Large-scale (cloud-based) developers, e.g., Google, Facebook and Amazon, who are interested in correctness conditions such as causal and eventual consistency in the databases they use. There is a direct link between the correctness conditions we study and such notions of consistency.

Ultimately the impact of this will be to enhance the correctness of software using such algorithms, which, with multi-core technology and implementation of specific TM models, could see very widespread usage.

Publications

10 25 50