Semantics-Directed Compiler Construction: From Formal Semantics to Certified Compilers
Lead Research Organisation:
University of Nottingham
Department Name: School of Computer Science
Abstract
Compilers are central to computing, translating programs written by people into code for machines. Some aspects of compiler development, such as syntax analysis, bridge the theory and the implementation in a principled way, with lexers and parsers being algorithmically derived from high-level specifications. On the other hand, there is currently an unbridged gap between the theoretical specification of a programming language, given by a formal semantics, and the code produced by the compiler. Relating the two post hoc is possible, but difficult and rarely done. However, it doesn't have to be this way.
A more principled approach is to begin with a semantics for the language, and seek to derive an implementation that is correct-by-construction. The investigators (Graham Hutton and Dan Ghica) have independently developed two such methodologies, which are based on complementary approaches to semantics (evaluators and abstract machines), but utilise different approaches to syntax (trees and graphs). The aim of this project is to reconcile the two methodologies to develop scalable and reusable frameworks for constructing certified compilers from semantics.
The project combines the strengths of two leading research groups, is enhanced by a team of expert collaborators (Patrick Bahr, Mario Lavarez-Picallo, Edwin Brady, Simon Marlow, Anil Madhavapeddy, and Beniamino Accattoli), and is supported by fully-funded PhD studentships from the host departments.
A more principled approach is to begin with a semantics for the language, and seek to derive an implementation that is correct-by-construction. The investigators (Graham Hutton and Dan Ghica) have independently developed two such methodologies, which are based on complementary approaches to semantics (evaluators and abstract machines), but utilise different approaches to syntax (trees and graphs). The aim of this project is to reconcile the two methodologies to develop scalable and reusable frameworks for constructing certified compilers from semantics.
The project combines the strengths of two leading research groups, is enhanced by a team of expert collaborators (Patrick Bahr, Mario Lavarez-Picallo, Edwin Brady, Simon Marlow, Anil Madhavapeddy, and Beniamino Accattoli), and is supported by fully-funded PhD studentships from the host departments.
Organisations
- University of Nottingham (Lead Research Organisation)
- IT University of Copenhagen (Project Partner)
- University of Cambridge (Project Partner)
- Meta (Previously Facebook) (Project Partner)
- University of St Andrews (Project Partner)
- Huawei Technologies (UK) Co Ltd (Project Partner)
- INRIA (Project Partner)
People |
ORCID iD |
| Graham Hutton (Principal Investigator) |
Publications
Bahr P
(2024)
Beyond Trees: Calculating Graph-Based Compilers (Functional Pearl)
in Proceedings of the ACM on Programming Languages
Garby Z
(2024)
Calculating Compilers Effectively (Functional Pearl)
Hewer B
(2024)
Quotient Haskell: Lightweight Quotient Types for All
in Proceedings of the ACM on Programming Languages
| Title | Haskell and Agda code for the article "Calculating Compilers Effectively" |
| Description | The Haskell and Agda source code for the paper Calculating Effectful Compilers Haskell (print.hs, state.hs, non-det.hs) In these files we give practical implementations of each of the three effects we consider in the paper. For each one, we include the full source code as described in the paper itself, and full definitions for each of the concrete effect interpretations. Agda (print.agda, state.agda, non-det.agda, lemmas.agda) As noted in the paper, we use Agda to formally verify the correctness of our calculations. These four files constitute those proofs. We use function extensionality, which is postulated in lemmas.agda. Each Agda file begins by translating the types and functions from Haskell into Agda's syntax. We prove all relevant results: * The functor and monad laws for each effect type * Compiler correctness exec (comp e c) s = do v <- eval e; exec c (v : s) * Correctness of the fusion of run and exec: exec' c s = run (exec c s) There are some differences between the Agda presentation and that given in the paper. In particular: * Agda does not like partial functions, so we augment the stack with an arity (essentially making it a vector) and the code with tags to describe how it transforms the shape of the stack. This is a standard, straightforward transformation which doesn't affect the validity of the proofs, but keeps Agda happy. * Agda's 'do' notation only permits one monad per file. So we are forced to translate the IO 'do' blocks into binds and lambdas. * The calculations are presented as "calculational proofs". Agda does not provide any utility for calculating data types and functions *during* equational proofs, so we structure these by first writing down the calculated definition and then following up with the proof that it is correct. These proofs, while technically not calculations, still follow the same steps as in the paper, so the reader can follow along. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| URL | https://zenodo.org/doi/10.5281/zenodo.12771956 |
| Title | Haskell and Agda code for the article "Calculating Compilers Effectively" |
| Description | The Haskell and Agda source code for the paper Calculating Effectful Compilers Haskell (print.hs, state.hs, non-det.hs) In these files we give practical implementations of each of the three effects we consider in the paper. For each one, we include the full source code as described in the paper itself, and full definitions for each of the concrete effect interpretations. Agda (print.agda, state.agda, non-det.agda, lemmas.agda) As noted in the paper, we use Agda to formally verify the correctness of our calculations. These four files constitute those proofs. We use function extensionality, which is postulated in lemmas.agda. Each Agda file begins by translating the types and functions from Haskell into Agda's syntax. We prove all relevant results: * The functor and monad laws for each effect type * Compiler correctness exec (comp e c) s = do v <- eval e; exec c (v : s) * Correctness of the fusion of run and exec: exec' c s = run (exec c s) There are some differences between the Agda presentation and that given in the paper. In particular: * Agda does not like partial functions, so we augment the stack with an arity (essentially making it a vector) and the code with tags to describe how it transforms the shape of the stack. This is a standard, straightforward transformation which doesn't affect the validity of the proofs, but keeps Agda happy. * Agda's 'do' notation only permits one monad per file. So we are forced to translate the IO 'do' blocks into binds and lambdas. * The calculations are presented as "calculational proofs". Agda does not provide any utility for calculating data types and functions *during* equational proofs, so we structure these by first writing down the calculated definition and then following up with the proof that it is correct. These proofs, while technically not calculations, still follow the same steps as in the paper, so the reader can follow along. |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Open Source License? | Yes |
| URL | https://zenodo.org/doi/10.5281/zenodo.12771955 |