Trustworthy programming for multiple instruction sets

Lead Research Organisation: University of Cambridge

Department Name: Computer Science and Technology

Abstract

The rapidly growing use of microprocessors in devices containing sensitive data (e.g. phones) and safety-critical systems (e.g. automobiles, avionics) is increasing the value of trustworthy software. Assembly code is particularly error-prone as it varies from processor to processor and even between different versions of processors in the same family. Some software must be implemented directly in assembler, such as run-time system components (e.g. storage management), performance-critical operations (e.g. arithmetic) and parts of operating systems (e.g. interrupt controllers). One cannot avoid having to create at least some coderunning on bare metal .Our goal is to develop a new programming methodology for creating trustworthy assembly code software. The project has two parts: 1. bottom-up creation of certified code components using proof-producing decompilation of assembly code into mathematical function definitions; 2. top-down compilation of certified implementations from mathematical function definitionsThe certification aspects are novel: they consist of automatically proving a new kind of processor-specific formal specification.Unlike other recent work on certified assembly code, we aim to go beyond establishing weak safety properties and instead handle functional correctness, termination and resource usage. We aim to generate deep proofs using very accurate ISA models. Our goals are complementary to the relatively shallow analyses based on the simplified semantics that underlie current industrial-scale bug-finding formal software verification tools.Our methods are not tied to a particular instruction set. Initially we will work with two instruction sets: ARM and a subset of IA-32, both of which are very widely used. We already have access to formal specifications of these.We aim to conduct diverse and realistic case studies, including multi-word arithmetic as used in cryptography and storage allocation and management routines used for runtime support of compiled code. Towards the end of the project we hope to verify a complete interpreter for a simple language based on pure LISP supporting high precision arithmetic -- a first step towards creating verified implementations of functional languages on bare metal.A long-term application, probably beyond the scope of this project, is creating certified run-time code for real domain-specific functional languages. A motivating example is the Haskell-based Cryptol language, which is used for specifying cryptographic algorithms.We plan to recruit a PhD student to explore the feasibility of creating verified operating systems components such as drivers, networking software, software-hardware interfaces, boot loaders and virtualisation support. This will require modelling parts of the hardware environment. Verifying a complete operating system is likely to be too much for a single PhD student, but we intend to collaborate with students at the University of Utah.

Funded Value:

£362,466

Funded Period:

Sep 08 - Mar 12

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/G007411/1

Principal Investigator:

Mike Gordon

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Fundamentals of Computing (100%)

Organisations

University of Cambridge (Lead Research Organisation)

People	ORCID iD
Mike Gordon (Principal Investigator)
Magnus Myreen (Researcher Co-Investigator)

Publications

Author Name

Title Publication Date Published

|< < 1 2 > >|

10 25 50

Alglave J (2009) The semantics of power and ARM multiprocessor machine code

Fox A (2010) Interactive Theorem Proving

Hainy M (2022) Optimal Bayesian design for model discrimination via classification. in Statistics and computing

Myreen M (2012) Interactive Theorem Proving

Myreen M (2011) Interactive Theorem Proving

Myreen M (2010) Interactive Theorem Proving

Myreen M (2012) Proof-producing synthesis of ML from higher-order logic in ACM SIGPLAN Notices

Myreen M (2009) Compiler Construction

Myreen M (2010) Verified Software: Theories, Tools, Experiments

Myreen M (2012) Proof-producing synthesis of ML from higher-order logic

Key Findings


Description	Our main finding is that decmopilation into logic, i.e. automatic extraction of tractable models from realistically modelled machine code, is possible even for large programs. This technique can be used for post hoc verification of existing code and also for synthesis of code that is correct by construction.
Exploitation Route	See "Exploitation routes" Our technique for decompilation into logic is already being used: the National ICT of Australia uses it in the verification of the seL4 microkernel (as part of the L4.verified project), Rockwell Collins (a large aerospace and defense company in the US) uses it, and there is currently an MoD project exploring how this might benefit the MoD.
Sectors	Aerospace Defence and Marine Digital/Communication/Information Technologies (including Software)

Abstract

Organisations

People

ORCID iD

Publications