DIADEM: debugging made dependable and measurable
Lead Research Organisation: King's College London
Department Name: Informatics
Abstract
Software quality is increasingly critical to most of humanity. Bugs in software claim a huge annual toll, financially and even in human life. To eliminate bugs, developers depend crucially on their tools. Tools for interactive debugging are vital. They alone provide a high- (source-) level view of a running (binary) program, enabling programmers to 'see the bug' as it occurs in the program running in front of them. However, debugging infrastructure is notoriously unreliable, as it works only if various metadata is complete and correct. If not, the programmer sees a partial or incorrect view, which may be useless or actively misleading.
These problems occur often in popular languages (e.g. C, C++, Rust, Go), owing to a tension between debuggability and optimisation. Debugging in these languages works by compiler-generated /metadata/ describing how binary (executable) code relates to source (human-written) code. Metadata generation is 'best-effort', and optimisation frequently introduces flaws -- but simply disabling optimisations is seldom an option. Programmers rely on optimisations to relieve them of much hand-tuning. Without them, code may run tens of times slower. Furthermore, some bugs appear only in optimised code, owing to undefined behaviour (underspecification) in the source language.
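To make the tension concrete, the following minimal C sketch is a hypothetical illustration (not drawn from the project's own benchmarks). Compiled without optimisation, a debugger can show the value of 'total' at every step; compiled at a typical optimisation level such as -O2, many compilers fold the whole loop into a constant, leaving the metadata with no location for 'total', which the debugger then reports as '<optimized out>'.

    #include <stdio.h>

    /* Hypothetical demo: at -O0 a breakpoint in the loop shows 'total'
     * updating on each iteration; at -O2 the loop is commonly folded to
     * the constant 285, so the debug metadata may carry no location for
     * 'total' and a debugger typically shows it as "<optimized out>". */
    int main(void)
    {
        int total = 0;
        for (int i = 0; i < 10; i++)
            total += i * i;      /* break here and ask the debugger for 'total' */
        printf("%d\n", total);   /* 0 + 1 + 4 + ... + 81 = 285 */
        return 0;
    }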
This problem is extremely challenging. The heart of it is that writing compiler optimisations that preserve debuggability demands extra effort, on what are already intricate code transformations ('passes'). In practice corners are cut, leaving the output metadata approximate. To be acceptable to pass authors, improvements must reshape the effort/reward curve without increasing the task's baseline complexity. Unlike performance, debugging so far lacks quantitative benchmarks, so compiler authors have not prioritised or competed on debuggability.
Existing techniques amount to interaction-based testing of debugging, often with features for narrowing down which passes introduced a flaw. This is haphazard, since exploring all metadata includes the already-hard problem of achieving full-coverage tests (to 'drive' the debugger over all program locations). We propose instead to analyse metadata as an artifact in its own right. This means that instead of tests that interact with a single concrete execution through a debugger, we must devise a systematic, symbolic method for exploring the compiled code, evaluating the correctness of metadata mathematically. Unlike haphazard testing, this promises systematic measurement of lost coverage and correctness; the latter can (we hypothesise) be automated using recent advances in formal specification of source languages, namely /executable semantics/, as a replacement for current manual practices. This idea of parallel source- and binary-level exploration also suggests a radical approach: post-hoc synthesis of metadata, relieving the compiler of generating it at all. The idea builds on successful work on neighbouring problems (translation validation and decompilation).
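To illustrate what measuring 'lost coverage' could look like, here is one plausible formulation, offered as a sketch rather than the project's definitive metric. For each source variable v, let defined(v) be the set of program points (source lines, or machine-code addresses within v's scope) at which v should hold a known value, and let located(v) be the subset of those points at which the emitted metadata actually supplies a location for v. Then

    coverage(v) = |located(v)| / |defined(v)|

and a whole-compilation score can aggregate coverage(v) over all variables, while a correctness check additionally asks whether the value recovered through each supplied location agrees with the source-level semantics at that point.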
The project will proceed by practical methods, experimenting on a real production compiler (LLVM). It will build novel tools embeddable into existing compiler-testing workflows, both to diagnose compiler bugs and to quantify the improvement from fixing them. It will empirically explore the abstractions and helpers used internally in compilers, to devise designs that make them measurably more debug-preserving. Finally, it will build a novel tool exploring the radical idea of synthesising high-quality metadata post hoc, outside the compiler, and will develop metrics allowing quantitative comparison against traditional approaches. Benefits accrue at many levels: to compiler authors, to software developers at large, and to the general public who use or depend on the affected software.
People
Stephen Kell (Principal Investigator)
Publications
Stinnett J. R. and Kell S. (2024) Accurate Coverage Metrics for Compiler-Generated Debugging Information. In CC 2024: Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction.
Further funding
| Description | Impact Acceleration Award (EPSRC) |
| Amount | £43,000 (GBP) |
| Organisation | King's College London |
| Sector | Academic/University |
| Country | United Kingdom |
| Start | 07/2024 |
| End | 08/2025 |
Collaborations and partnerships
| Description | DARPA E-BOSS programme participation with Galois (BRACELET project) |
| Organisation | Galois, Inc. |
| Country | United States |
| Sector | Private |
| PI Contribution | We provide complementary expertise on improving the debugging information available at crash reporting time, benefiting the quality of the technical output of the tools being created during this follow-on project (assisting triage, reachability analysis and the like). |
| Collaborator Contribution | Galois contributes the core reachability analysis expertise, in addition to engineering support. |
| Impact | Early days; no outputs yet
| Start Year | 2024 |
| Description | SN Systems debug info technology transfer partnership |
| Organisation | SONY |
| Department | Sony Interactive Entertainment |
| Country | Japan |
| Sector | Private |
| PI Contribution | We bring research on compiler debugging support, in or approaching a technology-transfer-ready state. |
| Collaborator Contribution | They mentor and advise on the technology transfer process, interfacing with upstream developer communities and the like, including advising on technical specifics of relevant projects
| Impact | Still working on the EPSRC IAA-funded output
| Start Year | 2024 |
Software and technical products
| Title | Accurate Coverage Metrics for Compiler-Generated Debugging Information (artifact) |
| Description | This is the artifact for the CC 2024 paper "Accurate Coverage Metrics for Compiler-Generated Debugging Information" by J. Ryan Stinnett and Stephen Kell. The source for this artifact is available at https://github.com/jryans/debug-info-metrics-artifact |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| Impact | An updated version of this artifact has been accepted in principle for inclusion in the LLVM compiler toolchain. |
| URL | https://zenodo.org/doi/10.5281/zenodo.10568391 (see also https://zenodo.org/doi/10.5281/zenodo.10568392) |
| Title | Debug info checker within KLEE |
| Description | A tool for identifying inconsistencies between the debugging information present in optimised and unoptimised LLVM IR compilations. |
| Type Of Technology | Software |
| Year Produced | 2022 |
| Open Source License? | Yes |
| Impact | Several compiler bugs have been reported to the LLVM project and acknowledged by its developers; fixes are ongoing.
| URL | https://github.com/jryans/klee/tree/debug-info-check |
Engagement activities
| Description | Debugging round table at the EuroLLVM developer meeting, May 2023 |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | About 40 people, mostly practitioners along with some students, attended a round-table discussion on the future of debugging-information generation in the LLVM compiler toolchain. The discussion was chaired by J. Ryan Stinnett, research associate on the DIADEM project.
| Year(s) Of Engagement Activity | 2023 |
| Description | HYTRADBOI conference |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | I gave a talk at an online industry-facing conference, HYTRADBOI, the goal being to familiarise software developers with our research and its underlying practices and perspectives. Q&A and follow-up discussion of my talk engaged around 15 individuals from a variety of backgrounds in software development. |
| Year(s) Of Engagement Activity | 2025 |
| URL | https://www.hytradboi.com/2025 |
| Description | Talk at the KLEE workshop, September 2022 |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | A talk at a workshop mixing academic researchers, industrial researchers, and industrial practitioners, focused on tools and experiments built on or with the KLEE symbolic execution engine.
| Year(s) Of Engagement Activity | 2022 |
| URL | https://srg.doc.ic.ac.uk/klee22/ |