Semantics-Directed Compiler Construction: From Formal Semantics to Certified Compilers

Lead Research Organisation: University of Nottingham

Department Name: School of Computer Science

Abstract

Compilers are central to computing, translating programs written by people into code for machines. Some aspects of compiler development, such as syntax analysis, bridge the theory and the implementation in a principled way, with lexers and parsers being algorithmically derived from high-level specifications. On the other hand, there is currently an unbridged gap between the theoretical specification of a programming language, given by a formal semantics, and the code produced by the compiler. Relating the two post hoc is possible, but difficult and rarely done. However, it doesn't have to be this way.

A more principled approach is to begin with a semantics for the language, and seek to derive an implementation that is correct-by-construction. The investigators (Graham Hutton and Dan Ghica) have independently developed two such methodologies, which are based on complementary approaches to semantics (evaluators and abstract machines), but utilise different approaches to syntax (trees and graphs). The aim of this project is to reconcile the two methodologies to develop scalable and reusable frameworks for constructing certified compilers from semantics.

The project combines the strengths of two leading research groups, is enhanced by a team of expert collaborators (Patrick Bahr, Mario Lavarez-Picallo, Edwin Brady, Simon Marlow, Anil Madhavapeddy, and Beniamino Accattoli), and is supported by fully-funded PhD studentships from the host departments.

Funded Value:

£466,425

Funded Period:

Mar 24 - Mar 27

Funder:

EPSRC

Project Status:

Active

Project Category:

Research Grant

Project Reference:

EP/Y010744/1

Principal Investigator:

Graham Hutton

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Fundamentals of Computing (100%)

Organisations

People	ORCID iD
Graham Hutton (Principal Investigator)

Publications

Author Name

Title Publication Date Published

10 25 50

Bahr P (2024) Beyond Trees: Calculating Graph-Based Compilers (Functional Pearl) in Proceedings of the ACM on Programming Languages

Garby Z (2024) Calculating Compilers Effectively (Functional Pearl)

Hewer B (2024) Quotient Haskell: Lightweight Quotient Types for All in Proceedings of the ACM on Programming Languages

Software and Technical Products


Title	Haskell and Agda code for the article "Calculating Compilers Effectively"
Description	The Haskell and Agda source code for the paper Calculating Effectful Compilers Haskell (print.hs, state.hs, non-det.hs) In these files we give practical implementations of each of the three effects we consider in the paper. For each one, we include the full source code as described in the paper itself, and full definitions for each of the concrete effect interpretations. Agda (print.agda, state.agda, non-det.agda, lemmas.agda) As noted in the paper, we use Agda to formally verify the correctness of our calculations. These four files constitute those proofs. We use function extensionality, which is postulated in lemmas.agda. Each Agda file begins by translating the types and functions from Haskell into Agda's syntax. We prove all relevant results: * The functor and monad laws for each effect type * Compiler correctness exec (comp e c) s = do v <- eval e; exec c (v : s) * Correctness of the fusion of run and exec: exec' c s = run (exec c s) There are some differences between the Agda presentation and that given in the paper. In particular: * Agda does not like partial functions, so we augment the stack with an arity (essentially making it a vector) and the code with tags to describe how it transforms the shape of the stack. This is a standard, straightforward transformation which doesn't affect the validity of the proofs, but keeps Agda happy. * Agda's 'do' notation only permits one monad per file. So we are forced to translate the IO 'do' blocks into binds and lambdas. * The calculations are presented as "calculational proofs". Agda does not provide any utility for calculating data types and functions during equational proofs, so we structure these by first writing down the calculated definition and then following up with the proof that it is correct. These proofs, while technically not calculations, still follow the same steps as in the paper, so the reader can follow along.
Type Of Technology	Software
Year Produced	2024
Open Source License?	Yes
URL	https://zenodo.org/doi/10.5281/zenodo.12771956


Title	Haskell and Agda code for the article "Calculating Compilers Effectively"
Description	The Haskell and Agda source code for the paper Calculating Effectful Compilers Haskell (print.hs, state.hs, non-det.hs) In these files we give practical implementations of each of the three effects we consider in the paper. For each one, we include the full source code as described in the paper itself, and full definitions for each of the concrete effect interpretations. Agda (print.agda, state.agda, non-det.agda, lemmas.agda) As noted in the paper, we use Agda to formally verify the correctness of our calculations. These four files constitute those proofs. We use function extensionality, which is postulated in lemmas.agda. Each Agda file begins by translating the types and functions from Haskell into Agda's syntax. We prove all relevant results: * The functor and monad laws for each effect type * Compiler correctness exec (comp e c) s = do v <- eval e; exec c (v : s) * Correctness of the fusion of run and exec: exec' c s = run (exec c s) There are some differences between the Agda presentation and that given in the paper. In particular: * Agda does not like partial functions, so we augment the stack with an arity (essentially making it a vector) and the code with tags to describe how it transforms the shape of the stack. This is a standard, straightforward transformation which doesn't affect the validity of the proofs, but keeps Agda happy. * Agda's 'do' notation only permits one monad per file. So we are forced to translate the IO 'do' blocks into binds and lambdas. * The calculations are presented as "calculational proofs". Agda does not provide any utility for calculating data types and functions during equational proofs, so we structure these by first writing down the calculated definition and then following up with the proof that it is correct. These proofs, while technically not calculations, still follow the same steps as in the paper, so the reader can follow along.
Type Of Technology	Software
Year Produced	2024
Open Source License?	Yes
URL	https://zenodo.org/doi/10.5281/zenodo.12771955

Abstract

Organisations

People

ORCID iD

Publications