📣 Help Shape the Future of UKRI's Gateway to Research (GtR)

We're improving UKRI's Gateway to Research and are seeking your input! If you would be interested in being interviewed about the improvements we're making and to have your say about how we can make GtR more user-friendly, impactful, and effective for the Research and Innovation community, please email gateway@ukri.org.

Semantics-Directed Compiler Construction: From Formal Semantics to Certified Compilers

Lead Research Organisation: University of Nottingham
Department Name: School of Computer Science

Abstract

Compilers are central to computing, translating programs written by people into code for machines. Some aspects of compiler development, such as syntax analysis, bridge the theory and the implementation in a principled way, with lexers and parsers being algorithmically derived from high-level specifications. On the other hand, there is currently an unbridged gap between the theoretical specification of a programming language, given by a formal semantics, and the code produced by the compiler. Relating the two post hoc is possible, but difficult and rarely done. However, it doesn't have to be this way.

A more principled approach is to begin with a semantics for the language, and seek to derive an implementation that is correct-by-construction. The investigators (Graham Hutton and Dan Ghica) have independently developed two such methodologies, which are based on complementary approaches to semantics (evaluators and abstract machines), but utilise different approaches to syntax (trees and graphs). The aim of this project is to reconcile the two methodologies to develop scalable and reusable frameworks for constructing certified compilers from semantics.

The project combines the strengths of two leading research groups, is enhanced by a team of expert collaborators (Patrick Bahr, Mario Lavarez-Picallo, Edwin Brady, Simon Marlow, Anil Madhavapeddy, and Beniamino Accattoli), and is supported by fully-funded PhD studentships from the host departments.

Publications

10 25 50
publication icon
Bahr P (2024) Beyond Trees: Calculating Graph-Based Compilers (Functional Pearl) in Proceedings of the ACM on Programming Languages

publication icon
Hewer B (2024) Quotient Haskell: Lightweight Quotient Types for All in Proceedings of the ACM on Programming Languages

 
Title Haskell and Agda code for the article "Calculating Compilers Effectively" 
Description The Haskell and Agda source code for the paper      Calculating Effectful Compilers   Haskell (print.hs, state.hs, non-det.hs)     In these files we give practical implementations of each of the   three effects we consider in the paper.  For each one, we include   the full source code as described in the paper itself, and full   definitions for each of the concrete effect interpretations.   Agda (print.agda, state.agda, non-det.agda, lemmas.agda)     As noted in the paper, we use Agda to formally verify the correctness   of our calculations.  These four files constitute those proofs.   We use function extensionality, which is postulated in lemmas.agda.     Each Agda file begins by translating the types and functions from   Haskell into Agda's syntax.  We prove all relevant results:        * The functor and monad laws for each effect type        * Compiler correctness             exec (comp e c) s = do v <- eval e; exec c (v : s)        * Correctness of the fusion of run and exec:             exec' c s = run (exec c s)      There are some differences between the Agda presentation and that   given in the paper.  In particular:        * Agda does not like partial functions, so we augment the stack        with an arity (essentially making it a vector) and the code with        tags to describe how it transforms the shape of the stack.  This        is a standard, straightforward transformation which doesn't        affect the validity of the proofs, but keeps Agda happy.          * Agda's 'do' notation only permits one monad per file.  So we are        forced to translate the IO 'do' blocks into binds and lambdas.          * The calculations are presented as "calculational proofs".  Agda        does not provide any utility for calculating data types and        functions *during* equational proofs, so we structure these by        first writing down the calculated definition and then following        up with the proof that it is correct.            These proofs, while technically not calculations, still follow the        same steps as in the paper, so the reader can follow along. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
URL https://zenodo.org/doi/10.5281/zenodo.12771956
 
Title Haskell and Agda code for the article "Calculating Compilers Effectively" 
Description The Haskell and Agda source code for the paper      Calculating Effectful Compilers   Haskell (print.hs, state.hs, non-det.hs)     In these files we give practical implementations of each of the   three effects we consider in the paper.  For each one, we include   the full source code as described in the paper itself, and full   definitions for each of the concrete effect interpretations.   Agda (print.agda, state.agda, non-det.agda, lemmas.agda)     As noted in the paper, we use Agda to formally verify the correctness   of our calculations.  These four files constitute those proofs.   We use function extensionality, which is postulated in lemmas.agda.     Each Agda file begins by translating the types and functions from   Haskell into Agda's syntax.  We prove all relevant results:        * The functor and monad laws for each effect type        * Compiler correctness             exec (comp e c) s = do v <- eval e; exec c (v : s)        * Correctness of the fusion of run and exec:             exec' c s = run (exec c s)      There are some differences between the Agda presentation and that   given in the paper.  In particular:        * Agda does not like partial functions, so we augment the stack        with an arity (essentially making it a vector) and the code with        tags to describe how it transforms the shape of the stack.  This        is a standard, straightforward transformation which doesn't        affect the validity of the proofs, but keeps Agda happy.          * Agda's 'do' notation only permits one monad per file.  So we are        forced to translate the IO 'do' blocks into binds and lambdas.          * The calculations are presented as "calculational proofs".  Agda        does not provide any utility for calculating data types and        functions *during* equational proofs, so we structure these by        first writing down the calculated definition and then following        up with the proof that it is correct.            These proofs, while technically not calculations, still follow the        same steps as in the paper, so the reader can follow along. 
Type Of Technology Software 
Year Produced 2024 
Open Source License? Yes  
URL https://zenodo.org/doi/10.5281/zenodo.12771955