Trustworthy programming for multiple instruction sets

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

The rapidly growing use of microprocessors in devices containing sensitive data (e.g. phones) and safety-critical systems (e.g. automobiles, avionics) is increasing the value of trustworthy software. Assembly code is particularly error-prone as it varies from processor to processor and even between different versions of processors in the same family. Some software must be implemented directly in assembler, such as run-time system components (e.g. storage management), performance-critical operations (e.g. arithmetic) and parts of operating systems (e.g. interrupt controllers). One cannot avoid having to create at least some coderunning on bare metal .Our goal is to develop a new programming methodology for creating trustworthy assembly code software. The project has two parts: 1. bottom-up creation of certified code components using proof-producing decompilation of assembly code into mathematical function definitions; 2. top-down compilation of certified implementations from mathematical function definitionsThe certification aspects are novel: they consist of automatically proving a new kind of processor-specific formal specification.Unlike other recent work on certified assembly code, we aim to go beyond establishing weak safety properties and instead handle functional correctness, termination and resource usage. We aim to generate deep proofs using very accurate ISA models. Our goals are complementary to the relatively shallow analyses based on the simplified semantics that underlie current industrial-scale bug-finding formal software verification tools.Our methods are not tied to a particular instruction set. Initially we will work with two instruction sets: ARM and a subset of IA-32, both of which are very widely used. We already have access to formal specifications of these.We aim to conduct diverse and realistic case studies, including multi-word arithmetic as used in cryptography and storage allocation and management routines used for runtime support of compiled code. Towards the end of the project we hope to verify a complete interpreter for a simple language based on pure LISP supporting high precision arithmetic -- a first step towards creating verified implementations of functional languages on bare metal.A long-term application, probably beyond the scope of this project, is creating certified run-time code for real domain-specific functional languages. A motivating example is the Haskell-based Cryptol language, which is used for specifying cryptographic algorithms.We plan to recruit a PhD student to explore the feasibility of creating verified operating systems components such as drivers, networking software, software-hardware interfaces, boot loaders and virtualisation support. This will require modelling parts of the hardware environment. Verifying a complete operating system is likely to be too much for a single PhD student, but we intend to collaborate with students at the University of Utah.

Publications

10 25 50
 
Description Our main finding is that decmopilation into logic, i.e. automatic

extraction of tractable models from realistically modelled machine

code, is possible even for large programs. This technique can be used

for post hoc verification of existing code and also for synthesis of

code that is correct by construction.
Exploitation Route See "Exploitation routes" Our technique for decompilation into logic is already being used: the

National ICT of Australia uses it in the verification of the seL4 microkernel

(as part of the L4.verified project), Rockwell Collins (a large

aerospace and defense company in the US) uses it, and there is

currently an MoD project exploring how this might benefit the MoD.
Sectors Aerospace, Defence and Marine,Digital/Communication/Information Technologies (including Software)