CaMELot: Catching and Mitigating Event-Loop Concurrency Issues

Lead Research Organisation: University of Kent
Department Name: Sch of Computing

Abstract

Most modern computer applications depend in one way or another on computations performed by server applications on the internet. More and more of these server applications are now built as so-called microservices, which have become popular because they allow developers to gradually update or fix parts of a larger application independently. Many of these microservices avoid certain types of concurrency issues by design. Unfortunately, they still suffer from other kinds of concurrency issues, for example when multiple online customers try to reserve the same seats at the same time.
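The seat-reservation scenario can be sketched as a classic check-then-act race between two request handlers. The following is a minimal illustrative sketch, not code from the project; all names are hypothetical:

```python
import threading

# Hypothetical seat store shared between request handlers.
# None means the seat is free.
seats = {"12A": None}
lock = threading.Lock()

def reserve_unsafe(seat_id, customer):
    # Check-then-act race: two concurrent handlers may both observe
    # the seat as free before either one writes, so one customer's
    # reservation is silently overwritten (a lost update).
    if seats[seat_id] is None:
        seats[seat_id] = customer
        return True
    return False

def reserve_safe(seat_id, customer):
    # Holding a lock around both the check and the update makes the
    # two steps atomic, ruling out the lost-update interleaving.
    with lock:
        if seats[seat_id] is None:
            seats[seat_id] = customer
            return True
        return False

# Run sequentially, even the unsafe version behaves correctly; the bug
# only surfaces under an unlucky interleaving of concurrent handlers,
# which is exactly why such issues are hard to catch in testing.
assert reserve_unsafe("12A", "alice") is True
assert reserve_unsafe("12A", "bob") is False
```

Because the faulty interleaving occurs only rarely, naive testing of `reserve_unsafe` usually passes, which illustrates why run-time detection is attractive.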

For software engineers, it is hard to test for all possible concurrent interactions. In practice, this means that only simple concurrency issues are reliably detected during testing. Complex issues, however, can easily slip through into production server applications, which then handle client requests incorrectly. One example of such a concurrency issue appeared at Nasdaq when the Facebook stock was traded for the first time, resulting in the loss of millions of dollars.

Our goal is to develop techniques that detect concurrency issues automatically at run time, circumvent them, and enable developers to fix them using the detailed information gathered during detection. Researchers have shown that one can detect and avoid issues, for instance by changing the order in which client requests are processed. In practice, however, current techniques slow server applications down significantly, which makes them too costly to use. Our aim is to dynamically balance the need for accurate information against the resulting slowdown. We conjecture that we can get most of the practical benefits while only rarely tracking the precise details of how program code executes. In addition to automatically preventing concurrency issues from causing problems, we will use the obtained information to give developers feedback so that they can fix the underlying issue in their software.

Thus, the overall goal of this research project is to make server applications, and specifically microservices, more robust and resilient to software bugs that are hard to test for and therefore typically remain undiscovered until they cause major problems for customers or companies.

Our work will result in adaptive techniques that detect concurrency issues and automatically trade off accuracy against run-time overhead, making them usable in practice. Furthermore, the detection techniques will provide actionable input to software developers, so that concurrency issues can be fixed and thus reliably prevented in the future.

To evaluate this work, we will collect various types of concurrency issues and make them openly available. This collection will be based on issues from industrial systems and on theoretical scenarios for highly complex bugs. We include the theoretical scenarios because such complex bugs are hard to diagnose and test for: they likely remain undiagnosed and undocumented in practice, yet have the potential to cause major disruptions.

Finally, we will build and evaluate our proposed techniques based on a system designed for concurrency research. The system uses the GraalVM technology of Oracle Labs, which allows us to prototype at the level of state-of-the-art systems, while keeping the development effort manageable for a small team.
 
Description One of the important questions of this work was whether concurrency bug detection can be applied to large-scale applications.
As a step towards this goal, we analyzed large-scale Ruby applications.
Our findings show that their behavior is similar to that of the smaller applications studied previously, which means our approach to optimizations can likely be applied to them as well. However, the large scale of these applications causes major overheads when using just-in-time compilation techniques such as inlining and splitting. While these are needed to achieve the best performance, it means large applications take hours instead of minutes to reach peak performance. With a novel splitting strategy, we can reduce oversplitting by up to 70% in such systems, which reduces the amount of compilation the system has to do.

Furthermore, we investigated how the run-time representation in virtual machines can be optimized to reduce memory use and variability. With this technique, we are able to minimize the impact of our detection technique on memory use and performance. Our latest work further investigated the tradeoffs between basic interpreter techniques and found that, with the meta-compilation technology available today, we can build and optimize languages based on a tree representation, which reduces engineering effort while reaching the same performance as classic instruction-oriented representations.
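The two interpreter styles compared here can be illustrated for a toy arithmetic language. This is a minimal sketch, not the project's code: an AST interpreter walks a tree of nodes, while a bytecode interpreter dispatches over a flat instruction sequence with an explicit, reified operand stack.

```python
# Tree representation: each node knows how to execute itself,
# and evaluation is a recursive walk over the tree.
class Lit:
    def __init__(self, value):
        self.value = value
    def execute(self):
        return self.value

class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right
    def execute(self):
        return self.left.execute() + self.right.execute()

# Instruction-oriented representation: a flat list of (opcode, operand)
# pairs interpreted with an explicit operand stack.
PUSH, ADD = 0, 1

def run(bytecode):
    stack = []
    for op, arg in bytecode:
        if op == PUSH:
            stack.append(arg)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack.pop()

# The same program, (1 + 2) + 3, in both representations.
ast_result = Add(Add(Lit(1), Lit(2)), Lit(3)).execute()
bc_result = run([(PUSH, 1), (PUSH, 2), (ADD, None), (PUSH, 3), (ADD, None)])
```

The sketch also hints at the memory discussion below: the bytecode interpreter must materialize its operand stack as a real data structure, whereas the AST interpreter keeps intermediate values implicitly on the host call stack.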
Exploitation Route Our current results will be of use to programming language implementers, such as large companies building browsers and compilers.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description Industry Fellowship
Amount £120,003 (GBP)
Funding ID INF\R1\211001 
Organisation The Royal Society 
Sector Charity/Non Profit
Country United Kingdom
Start 08/2021 
End 08/2024
 
Description Automatic Superinstructions for Bytecode Interpreters 
Organisation Oracle Corporation
Country United States 
Sector Private 
PI Contribution We supported the Oracle GraalVM team in their design and implementation of a new domain-specific language for bytecode interpreters. We evaluated design options on our research platform and provided feedback and improvement ideas, which eventually led to productizing the language.
Collaborator Contribution The team at Oracle led the design and implementation of the language. They gave us access to their prototypes for evaluation and provided support to enable us to adapt the language.
Impact The domain-specific language was productized and is now part of the Oracle GraalVM product.
Start Year 2024
 
Description Debugging Technology and User Evaluations 
Organisation Vrije Universiteit Brussel
Country Belgium 
Sector Academic/University 
PI Contribution We contribute expertise on language implementation techniques, compilers, optimizations, and concurrency bug detection.
Collaborator Contribution Our partners contribute expertise on debugging, user studies and empirical evaluation, distributed systems, and bugs in distributed systems.
Impact The first outcome of the collaboration is a completed user study; the resulting paper is currently under revision after a first round of reviews.
Start Year 2021
 
Description Large-Scale Automated Testing 
Organisation University of Ghent
Country Belgium 
Sector Academic/University 
PI Contribution We contributed our expertise on defining and implementing debugger protocols, which enabled a new approach to automated testing of large-scale systems.
Collaborator Contribution Prof. Scholliers and his group built and evaluated a new system for large-scale testing of embedded devices based on our debugging and debugger testing technology. They further refined the approach, including first steps on formalizing the semantics of the overall approach.
Impact The collaboration resulted in the implementation of a new software system, Latch, as well as a draft paper, which is currently under revision.
Start Year 2022
 
Description Virtual Machine Design and Application Integration 
Organisation University of Buenos Aires
Country Argentina 
Sector Academic/University 
PI Contribution We contributed expertise on Virtual Machine Design and language implementation to this project.
Collaborator Contribution J. Pimás designed and implemented a novel system that combines the virtual machine and the application in the same codebase and programming language, to reap the benefits of uniform tooling and representation across the full software stack. Traditionally, the virtual machine is intended to completely insulate the application from implementation details. However, in his research, he found that violating this general premise can have major benefits for small software teams.
Impact We published a paper titled "Live Objects All The Way Down: Removing the Barriers between Applications and Virtual Machines" (DOI: https://doi.org/10.22152/programming-journal.org/2024/8/5).
Start Year 2022
 
Title AST vs. Bytecode: Interpreters in the Age of Meta-Compilation (Artifact) 
Description This artifact accompanies our paper AST vs. Bytecode: Interpreters in the Age of Meta-Compilation, to enable others to reuse our experimental setup and methodology and to verify our claims. Specifically, the artifact covers our three contributions: It contains the implementation of our methodology to identify run-time performance and memory-usage tradeoffs between AST and bytecode interpreters; thus, it contains all benchmarks and experiments for reproducing the results and reusing them in new experiments, as well as the data we collected to verify our analysis. It contains PySOM and TruffleSOM, which both come with an AST and a bytecode interpreter to enable their comparison; it further contains all the variants of PySOM and TruffleSOM that assess the impact of specific optimizations. It allows readers to verify the key claim of our paper: that bytecode interpreters cannot be assumed to be faster than AST interpreters in the context of meta-compilation systems.

Paper Abstract Thanks to partial evaluation and meta-tracing, it became practical to build language implementations that reach state-of-the-art peak performance by implementing only an interpreter. Systems such as RPython and the GraalVM provide components such as a garbage collector and just-in-time compiler in a language-agnostic manner, greatly reducing implementation effort. However, meta-compilation-based language implementations still need to improve further to reach the low memory use and fast warmup behavior that custom-built systems provide. A key element in this endeavor is interpreter performance. Folklore tells us that bytecode interpreters are superior to abstract-syntax-tree (AST) interpreters both in terms of memory use and run-time performance. This work assesses the trade-offs between AST and bytecode interpreters to verify common assumptions and to check whether they hold in the context of meta-compilation systems.

We implemented four interpreters, an AST and a bytecode one, based on RPython as well as GraalVM. We keep the differences between the interpreters as small as feasible to be able to evaluate interpreter performance, peak performance, warmup, memory use, and the impact of individual optimizations. Our results show that both systems indeed reach performance close to Node.js/V8. Looking at interpreter-only performance, our AST interpreters are on par with, or even slightly faster than, their bytecode counterparts. After just-in-time compilation, the results are roughly on par. This means bytecode interpreters do not have their widely assumed performance advantage. We can confirm, however, that bytecodes are more compact in memory than ASTs, which becomes relevant for larger applications. Yet for smaller applications, we noticed that bytecode interpreters allocate more memory, because boxing avoidance is not as applicable and because the bytecode interpreter structure itself requires memory, e.g., for a reified stack. Our results show that AST interpreters are competitive on top of meta-compilation systems. Together with possible engineering benefits, they should thus not be discounted so easily in favor of bytecode interpreters.
Type Of Technology Software 
Year Produced 2023 
Impact No notable impact. 
URL https://zenodo.org/record/8333815