Verifying Resource-like Data Use in Programs via Types

Lead Research Organisation: University of Kent

Department Name: Sch of Computing

Abstract

Software is integral to the fabric of our lives, controlling transport, the economy and
infrastructure, and providing the main tools of work and leisure. The cost of software failures is
therefore high, to productivity, stability, safety, and privacy. As an indication of the economic
impact, it is estimated that software errors cost the US economy tens of billions of dollars last
decade. Software errors can also impact human safety as software is used to control transport,
infrastructure, and medical equipment. The research agenda of program verification aims to mitigate
these risks by putting software engineering on a rigorous foundation through techniques to guarantee
program correctness. As software becomes more complex and pervasive, program verification is ever
more important. This work aims to advance the state-of-the-art in verification at the point of
program development through advancements in programming language theory and practice.

Programming languages are the core tool in which software is constructed; they are the means of
communicating our intentions to computer hardware. There are various design trade-offs when creating
a programming language, which has led to the variety of programming languages in use today. Some
languages support verification by including a "type system" which categorises data and operations,
ensuring that operations are only applied to data of the correct type. This provides some guarantees
about the correctness of the program by ruling out various kinds of error before the program is ever
executed. A subfield of programming language research aims to make type systems more expressive so
that they can describe and enforce more properties of programs and thus raise the level of
verification possible within the language. The current proposal aims to do just this, focussing on
the notion of data as a "resource" subject to constraints which should be enforced by a language's
type system. This project develops a particular technique for building such type systems in a way
that can capture various kinds of property in one system.

The manipulation of data is central to the task of programming. Current programming languages
essentially view data as an infinitely-replicable resource that can be stored, manipulated, and
communicated without restriction. However, this perspective is naive. Some data is private and thus
should not be arbitrarily copied, stored, or communicated. Some data is large and thus should not be
replicated too frequently. Some data acts as a way of interfacing with other parts of a system,
e.g., files or communication channels, which are then subject to restrictions on how they can be
used, such as agreed protocols of interaction. Most programming languages are agnostic to these
constraints however, treating data as totally unconstrained. This can lead to various software
errors, including privacy breaches, performance problems, and crashes due to incorrect interactions
between parts of a system. The goal of this research is to embed the constraints associated with
data into the type system of a language so that these additional constraints can be automatically
enforced. The key technology for doing this is a novel notion of types called "graded modal types"
which can capture and track different kinds of information about the structure of programs and the
flow of data through them.

This work will develop the theory and practice of graded modal types, producing a prototype language
that demonstrates their use for verifying program properties. Three case studies will be carried out
to demonstrate their power: (1) ensuring privacy and confidentiality, (2) capturing and reasoning
about performance e.g., how fast a program will execute relative to the size of its inputs and how
much memory it will consume (3) enforcing fine-grained protocols of interaction.

This work is a step towards a new generation of trustworthy software.

Planned Impact

The pervasive nature of software in our society means that the tools used for creating software have
wide-reaching impact. Programming languages in particular impact the ability of software engineers
(developers) to read, write, and reason about their programs, as well as the ability of computers to
effectively execute programs. This project is primarily concerned with making it easier for
developers to reason about the use of data in their programs, and to ensure that extra-functional
constraints associated to data (e.g., confidentiality) are respected. This kind of reasoning is
particularly important for developers of software used in high-risk contexts, e.g., travel, medical
applications, infrastructure, and for developers of software that handles sensitive data, e.g.,
mobile phone apps, websites, financial services. The techniques to provide this enhanced ability
for program reasoning and verification involve scientific advances in the area of programming
language theory. The practical application of this theory will deliver a novel programming approach,
exemplified in a new language which can assist program verification. Thus this project will deliver
complementary impact across both academia and industry, and further to society and the economy via
its impact in the software development industry.

In terms of industrial impact, and beyond, there are two many routes in which the main practical
deliverable, the language Granule, can have impact: (1) by becoming a mature programming language
that can be used by developers; (2) by influencing the design of existing and future programming
languages. Within the lifetime of this project, the latter mode of impact is the most likely, though
the impact activities here support both routes to delivering impact.

The Granule implementation will be open source, allowing ideas to be easily adopted by other
language implementers and encouraging a community of developers to grow, contributing to Granule
directly. The language will be promoted online, including via a series of online tutorial videos,
written tutorials, and online documentation. In the second year of the project, an online hackathon
competition will be run to promote Granule and to encourage early contributions and feedback from
practitioners. Industry talks will also be given and an academic workshop at the end of year 1 will
also be open to industrials. All these impact activities will not only help to promote the work to
industry and other language implementors, but will also assist the academic impact as well.

The impact and popularity of a language often relates to the amount and quality of library
support. Therefore, there will be a focus in the second year of the project on improving the
libraries and encouraging development of these by others outside the project team (see above).

Running throughout the project are three key practical application case studies: privacy, cost, and
protocols. The case studies will help to ensure that the research links with practical
applications, and will then serve as a way to demonstrate the verification technique developed
here. The privacy case study is in collaboration with industrial partner Galois Inc., and is
particularly relevant in the context of modern software development, where privacy is increasingly a
concern and there are increasing legal requirements to meet data protection standards.

Following from the industrial impact, the benefits to society and the economy are then yielded in
terms of correct and trustworthy software, with fewer bugs, which in turn reduces the cost of
software and reduces risk to stability, safety, and privacy. Take together, the long-term goals of
the project help to support a more rigorous engineering foundation for the UK's considerable
software industry, and further afield.

Funded Value:

£229,327

Funded Period:

Jul 20 - Jun 23

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/T013516/1

Principal Investigator:

Dominic Orchard

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Fundamentals of Computing (75%)

Software Engineering (25%)

Organisations

People	ORCID iD
Dominic Orchard (Principal Investigator)

Publications

Author Name

Title Publication Date Published

10 25 50

Bocchi L (2022) A Theory of Composing Protocols in The Art, Science, and Engineering of Programming

Gaboardi M (2021) Programming Languages and Systems - 30th European Symposium on Programming, ESOP 2021, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021, Luxembourg City, Luxembourg, March 27 - April 1, 2021, Proceedings

Hughes J (2021) Logic-Based Program Synthesis and Transformation - 30th International Symposium, LOPSTR 2020, Bologna, Italy, September 7-9, 2020, Proceedings

Hughes J (2021) Linear Exponentials as Graded Modal Types

Ivaškovic, A. (2020) Data-flow analyses as effects and graded monads in Leibniz International Proceedings in Informatics, LIPIcs

Koparkar C (2021) Efficient tree-traversals: reconciling parallelism and dense data representations in Proceedings of the ACM on Programming Languages

Marshall D (2022) Programming Languages and Systems - 31st European Symposium on Programming, ESOP 2022, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Munich, Germany, April 2-7, 2022, Proceedings

Marshall D (2022) Replicate, Reuse, Repeat: Capturing Non-Linear Communication via Session Types and Graded Modal Types in Electronic Proceedings in Theoretical Computer Science

Marshall D (2022) How to Take the Inverse of a Type

Moon B (2021) Programming Languages and Systems - 30th European Symposium on Programming, ESOP 2021, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021, Luxembourg City, Luxembourg, March 27 - April 1, 2021, Proceedings

Key Findings
Research Databases and Models
Collaboration
Software and Technical Products
Engagement Activities


Description	Given an adequate specification of a computer program, a computer can automatically generate a program to satisfy the specification; this is a technique known as program synthesis which depends on having enough information in the specification, and a sophisticated enough synthesis technique, in order to generate suitable programs. In the first six months of this award we have devised a new technique for synthesising certain kinds of programs based on specifications about how they use their input data, specified in an abstract form which is very different to the program itself, and requires little effort from the human user. This leverages two key theoretical ideas at heart of this project which allow a programming language to describe how data is used in a general way which can be used to talk about privacy and confidentiality of data. This work constitutes a leap forward in leveraging this information for automatic program synthesis. The project has also developed a techniques for understanding computer programs by generalising and extending various existing ideas in the literature. Furthermore, we have developed new models for existing programming languages. Taken together, these deepen our understanding of how to reason about programs which can be used, for example, to determine program correctness.
Exploitation Route	The technique developed can be deployed in other languages which has some of the same key ideas as our research language: Haskell and Idris. We are working on a variant of our technique that can be deployed in these more main stream languages.
Sectors	Digital/Communication/Information Technologies (including Software)
URL	https://github.com/granule-project/granule-project.github.io/blob/master/synth-tutorial.md


Title	Additional proofs and code for "Data-flow analyses as effects and graded monads"
Description	This deposit provides code and additional proofs associated to the paper "Data-flow analyses as effects and graded monads" appearing at FSCD 2020 (5th International Conference On Formal Structures for Computation and Deduction). extra-proofs.pdf provides additional proofs not included in the appendix of the published paper for space reasons. GradedMonad.agda provides further mechanised proofs, referred to from extra-proofs.pdf dataflow-effects-as-grades-fscd2020.zip provides the source code corresponding to Section 4.4 and Appendix B The code is hosted on GitHub as well: https://github.com/dorchard/dataflow-effects-as-grades This .zip corresponds to this release https://github.com/dorchard/dataflow-effects-as-grades/releases/tag/fscd2020 Unzip and see README.md for details on how to build and interact with this code
Type Of Material	Database/Collection of data
Year Produced	2020
Provided To Others?	Yes
Impact	This deposit provides a library code and additional proofs associated to the paper "Data-flow analyses as effects and graded monads" appearing at FSCD 2020 (5th International Conference On Formal Structures for Computation and Deduction). This provided code can be reused for future works as a building block.
URL	https://zenodo.org/record/3784967


Title	Granule programming language implementation
Description	In functional programming, regular types are a subset of algebraic data types formed from products and sums with their respective units. One can view regular types as forming a commutative semiring but where the usual axioms are isomorphisms rather than equalities. In the pearl, we show that regular types in a linear setting permit a useful notion of multiplicative inverse, allowing us to 'divide' one type by another. We begin with an exploration of the properties and applications of this construction, and this artifact includes examples in Haskell demonstrating its relevance to various topics from the literature including program calculation, Laurent polynomials, and derivatives of data types. Some examples from the paper require a richer setting than Haskell can offer; the artifact replays the first set of examples in Granule, while also presenting additional examples demonstrating further algebraic structure through linear functions that incorporate local side effects
Type Of Material	Computer model/algorithm
Year Produced	2022
Provided To Others?	Yes
Impact	This was the artefact for a paper. The artefact won the Distinguished Artefact Award. The paper also won Distinguished Paper Award.
URL	https://drops.dagstuhl.de/opus/volltexte/2022/16199/


Description	Academic collaboration on Graded Hoare Logic
Organisation	Boston University
Country	United States
Sector	Academic/University
PI Contribution	Key academic contributions to a new paper, with roughly equal split between partners.
Collaborator Contribution	Key academic contributions to a new paper, with roughly equal split between partners.
Impact	Publication "Graded Hoare Logic and its Categorical Semantics" https://link.springer.com/chapter/10.1007/978-3-030-72019-3_9
Start Year	2020


Description	Academic collaboration on Graded Hoare Logic
Organisation	National Institute of Informatics (NII)
Country	Japan
Sector	Public
PI Contribution	Key academic contributions to a new paper, with roughly equal split between partners.
Collaborator Contribution	Key academic contributions to a new paper, with roughly equal split between partners.
Impact	Publication "Graded Hoare Logic and its Categorical Semantics" https://link.springer.com/chapter/10.1007/978-3-030-72019-3_9
Start Year	2020


Description	Academic collaboration on Graded Hoare Logic
Organisation	Tokyo Institute of Technology
Country	Japan
Sector	Academic/University
PI Contribution	Key academic contributions to a new paper, with roughly equal split between partners.
Collaborator Contribution	Key academic contributions to a new paper, with roughly equal split between partners.
Impact	Publication "Graded Hoare Logic and its Categorical Semantics" https://link.springer.com/chapter/10.1007/978-3-030-72019-3_9
Start Year	2020


Title	Gerty
Description	A new research prototype language accompanying our ESOP 2021 on 'Graded Modal Dependent Type Theory'
Type Of Technology	Webtool/Application
Year Produced	2021
Open Source License?	Yes
Impact	Demonstrates a novel type system and its practical implications which goes beyond the existing state-of-the-art.
URL	https://github.com/granule-project/gerty/


Title	Granule - A modern functional programming language with graded linear and modal types
Description	Latest release of the Granule programming language which includes results from publications arising from this grant in the latter half of 2020, including our paper on Resource Program Synthesis for Graded Linear Types which is deployed here as a new program synthesis procedure.
Type Of Technology	Software
Year Produced	2020
Open Source License?	Yes
Impact	This software release has been used by the team in various presentations, including in an invited talk to Augusta University, and on online end of year review video.
URL	https://github.com/granule-project/granule/releases/tag/v0.8.0.0


Description	Bristol visit
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Postgraduate students
Results and Impact	Invited research visit to Bristol to discuss research ideas of this project with the Bristol Programming languages group and deliver a research seminar.
Year(s) Of Engagement Activity	2021


Description	Podcast
Form Of Engagement Activity	A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Interviewed for a video podcast series hosted by the company 47 degrees; this gave a good opportunity to engage with industry and spread the ideas of this research. The video has been watched 319 times at the time of writing.
Year(s) Of Engagement Activity	2021
URL	https://www.youtube.com/watch?v=6yswyRQGsRQ


Description	Presentation at Huawei Programming Languages group
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	Research talk to the local group but also streamed to the wider Huawei research teams and subsequently uploaded online:
Year(s) Of Engagement Activity	2022
URL	https://www.bilibili.com/video/BV1DA4y1Z734/?zw&spm_id_from=888.80996.embed_old

Abstract

Planned Impact

Organisations

People

ORCID iD

Publications