Verifying concurrent algorithms on Weak Memory Models

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

Multi-core computing architectures have become ubiquitous over the last decade. This has been driven by the demand for continual performance improvements to cope with the ever increasing sophistication of applications (for example, in graphics and audio processing), and made necessary by constraints on chip manufacture that have halted the performance gains of earlier decades, which were achieved primarily by speeding up sequential computation. The inherent parallelism of these multi-core architectures offers great technical opportunities, but exploiting those opportunities presents a number of technical challenges.

The high-level aim of this project is to address two key technical challenges in this area. Firstly, in order to fully exploit the potential concurrency, programmers are developing very subtle concurrent algorithms which dispense with the need to lock shared memory and data structures. Linearizability, which informally requires every concurrent execution of a data structure to behave as if its operations took effect one at a time, in an order consistent with the order of non-overlapping operations, is the standard correctness criterion for such programs. However, the complexity of these algorithms means that checking their correctness with a high degree of confidence is extremely difficult. Verification is needed, and we seek to develop appropriate proof methods to support it.
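To give a flavour of the algorithms in question, the sketch below (a minimal illustrative example, not code from this project) shows a lock-free counter written against Java's java.util.concurrent.atomic API. Instead of taking a lock, each increment retries a compare-and-set (CAS) until it succeeds; even for so small an example, a correctness argument must identify the point at which the operation appears to take effect (here, the successful CAS), and for realistic lock-free data structures such arguments become far more involved.

    import java.util.concurrent.atomic.AtomicInteger;

    // Minimal sketch of a lock-free counter: no locks are taken. Each
    // increment reads the current value and retries a compare-and-set
    // until no other thread has interfered in between.
    // (AtomicInteger.incrementAndGet performs this loop internally.)
    class LockFreeCounter {
        private final AtomicInteger value = new AtomicInteger(0);

        int increment() {
            while (true) {
                int current = value.get();
                int next = current + 1;
                // The CAS succeeds only if the value is still 'current';
                // otherwise another thread won the race and we retry.
                if (value.compareAndSet(current, next)) {
                    return next;
                }
            }
        }

        int get() {
            return value.get();
        }
    }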

Secondly, most prior work on correctness assumes a memory model (sequential consistency) which is not implemented in practice. In reality, to increase efficiency, typical multicore systems communicate via shared memory and use relaxed memory models which give greater scope for optimisation. These are the memory models implemented in processors such as x86, PowerPC, and ARM. On these processors linearizability is not the only relevant correctness criterion, and we shall also develop proof methods for quiescent consistency, which is emerging as an alternative correctness criterion on such architectures.
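The classical "store buffering" litmus test illustrates the difference. In the sketch below (a hypothetical Java example with deliberately unsynchronised fields, not code from this project), sequential consistency forbids the final outcome r1 == 0 and r2 == 0, since at least one thread must see the other's write; on relaxed architectures such as x86, PowerPC and ARM, and for unsynchronised accesses under the Java memory model, that outcome is permitted and can be observed on real hardware. A single run will rarely exhibit it; stress-testing tools run such litmus tests many millions of times to expose the relaxed behaviours.

    // Store-buffering litmus test (illustrative sketch only).
    // Sequentially consistent executions never end with r1 == 0 and r2 == 0,
    // but relaxed memory models (and unsynchronised Java fields) permit it.
    class StoreBuffering {
        static int x = 0, y = 0;   // shared, deliberately unsynchronised
        static int r1, r2;

        public static void main(String[] args) throws InterruptedException {
            Thread t1 = new Thread(() -> { x = 1; r1 = y; });
            Thread t2 = new Thread(() -> { y = 1; r2 = x; });
            t1.start(); t2.start();
            t1.join();  t2.join();
            System.out.println("r1=" + r1 + " r2=" + r2);
        }
    }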

Planned Impact

The global benefits of multi-core computing are clear: better performance and more sophisticated applications.
Correct concurrent algorithms are key to ensuring those benefits are realised. Without concurrent algorithms the performance benefits multi-core can offer are diminished. Without correct concurrent algorithms the range of applications that can take advantage of these benefits is limited (to those where high reliability is not a concern). The use of fine-grained concurrent algorithms has already moved from niche applications (such as graphics and audio processing) to widely used computing infrastructure: operating systems and programming libraries (e.g., java.util.concurrent). Work on verifying the correctness of these algorithms will help translate the potential benefits of these developments into real, tangible ones. Consequently, the ultimate economic implications of this strand of research are enormous.

The use of lock-free algorithms is already standard in some applications, e.g. audio processors, operating systems and graphics. High-performance server software usually runs on multicore hardware, and increasingly uses lock-free designs to improve performance (see, e.g., http://www.linuxjournal.com/content/lock-free-multi-producer-multi-consumer-queue-ring-buffer).

In addition, lock-free approaches are now entering the mainstream of languages such as Java and C++. For example, until JDK 5.0 it was not possible to write wait-free, lock-free algorithms in the Java language without using native code, but with java.util.concurrent the situation has changed. The correct use of lock-free implementations of Java classes such as ConcurrentHashMap and AtomicInteger is frequently discussed on code bulletin boards.
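As an illustration of the APIs involved (a routine usage sketch, not code from this project), the following program counts events from several threads without any explicit locking: ConcurrentHashMap handles concurrent updates to the map structure, and AtomicInteger handles concurrent increments of each count.

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicInteger;

    // Count occurrences of each event name from a pool of worker threads,
    // relying on lock-free/fine-grained classes from java.util.concurrent
    // rather than explicit synchronisation.
    public class EventCounts {
        public static void main(String[] args) throws InterruptedException {
            ConcurrentHashMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();
            String[] events = {"read", "write", "read", "read", "write"};

            ExecutorService pool = Executors.newFixedThreadPool(4);
            for (String event : events) {
                pool.submit(() ->
                    counts.computeIfAbsent(event, k -> new AtomicInteger())
                          .incrementAndGet());
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);

            System.out.println(counts);   // e.g. {read=3, write=2}
        }
    }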

Similarly, CDS is a C++ template library of lock-free and fine-grained algorithms. It contains a collection of concurrent data structure implementations, with support for the memory models of x86 and other architectures, including lists (e.g., the lazy list), queues (e.g., the Michael and Scott queue) and skip lists. A sketch of the Michael and Scott queue is given below.
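The Michael and Scott queue just mentioned is representative of the algorithms concerned. The sketch below is a minimal Java transcription of its enqueue and dequeue operations (an illustrative rendering of the published algorithm, not the CDS implementation); note how an operation that finds the tail pointer lagging behind must "help" the concurrent enqueue by swinging the tail forward, exactly the kind of interference between threads that makes correctness arguments so subtle.

    import java.util.concurrent.atomic.AtomicReference;

    // Sketch of the Michael & Scott lock-free queue: head and tail both
    // start at a dummy node; enqueue links a new node after the last node
    // and then swings tail; dequeue advances head past the dummy node.
    class MSQueue<T> {
        private static final class Node<T> {
            final T value;
            final AtomicReference<Node<T>> next = new AtomicReference<>(null);
            Node(T value) { this.value = value; }
        }

        private final AtomicReference<Node<T>> head;
        private final AtomicReference<Node<T>> tail;

        MSQueue() {
            Node<T> dummy = new Node<>(null);
            head = new AtomicReference<>(dummy);
            tail = new AtomicReference<>(dummy);
        }

        void enqueue(T value) {
            Node<T> node = new Node<>(value);
            while (true) {
                Node<T> last = tail.get();
                Node<T> next = last.next.get();
                if (last == tail.get()) {                    // snapshot still consistent?
                    if (next == null) {
                        if (last.next.compareAndSet(null, node)) {
                            tail.compareAndSet(last, node);  // swing tail (may fail harmlessly)
                            return;
                        }
                    } else {
                        tail.compareAndSet(last, next);      // help a lagging enqueuer
                    }
                }
            }
        }

        T dequeue() {
            while (true) {
                Node<T> first = head.get();
                Node<T> last = tail.get();
                Node<T> next = first.next.get();
                if (first == head.get()) {
                    if (first == last) {
                        if (next == null) return null;       // queue is empty
                        tail.compareAndSet(last, next);      // help swing tail
                    } else {
                        T value = next.value;
                        if (head.compareAndSet(first, next)) return value;
                    }
                }
            }
        }
    }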

Thus it is likely to become the norm to write lock-free algorithms in order to increase parallelism, but at the expense of subtle and difficult correctness arguments.

Our work seeks to make an impact by enabling such algorithms to be verified correct. However, to date there has been little work on verifying linearizability or quiescent consistency with respect to the memory models of real processors. This is the contribution we seek to make.

Because of their subtlety, it can be exceptionally difficult to isolate correctness faults in these algorithms through testing, and their potentially widespread use means that verified correctness (particularly at processor level) will be important. Because of the particular characteristics of weak memory models, the methodology and results for sequentially consistent architectures do not carry over automatically.

Ultimately, the impact of this work will be to enhance the correctness of software that uses such algorithms; with the spread of multi-core technology and the memory models it implements, such software could see very widespread usage.
