Unifying Theories of Generic Programming

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

The world is increasingly dependent on robust, reliable software that meets its specification. At the same time, software has to be delivered at an increasingly faster rate. As a consequence, the software industry, especially in Europe and the UK, is facing a growing tension between productivity and reliability. Generic programming aims at relieving this tension.

The vision of this project is to develop a unifying theory of generic programming that will inform the design of future programming languages. Our goal is to produce a body of work that is in the same vein as the seminal work on Unifying Theories of Programming by Hoare and He: we consider the outcome of this project to be a unified foundation for generic programming that brings together the advantages of previous work into a coherent whole.

From the perspective of increasing programmer productivity, the importance of understanding and applying generic programming has never been so critical. Software engineers are constantly faced with the challenge of adapting to changing specifications and designs, while ensuring that the ensuing refactoring maintains the correctness of the algorithms they have crafted. Our approach to solving this problem - generic programming - exploits the inherent structure that exists in data, where this structure can be used to automatically produce efficient and flexible algorithms that can be adapted to suit different needs. Furthermore, generic programs ensure that the structure of the data itself plays a central role in maintaining the correctness of these algorithms.

The functional programming language Haskell offers rudimentary support for generic programming in the form of the deriving mechanism. Instead of manually coding, for example, equality for a datatype, the Haskell programmer attaches a simple hint to the datatype declaration which instructs the compiler to auto-generate equality and inequality for the datatype. Simple, convenient and robust. If the datatype is changed at a later point in time, equality and inequality are modified accordingly behind the scenes, supporting software evolution and easing software maintenance.

Haskell's support for generic programming is only partial, as the deriving mechanism is limited to a few predefined classes. In particular, one cannot define new derivable classes. This is exactly what generic programming supports. Informally, a derivable or generic function is defined by induction on the structure of types. The generic programmer provides code for a few type constructs, the rest is taken care of automatically. The generic program can then be instantiated many times at different types.

The last two decades have witnessed a number of approaches to generic programming differing in convenience, expressiveness and efficiency. We can roughly distinguish two main approaches: algebraic and type-theoretic ones. Both come with their various strengths and weaknesses. This project seeks to generalise and unify the two approaches, combining their individual strengths. It will do so using methods from a branch of mathematics called Category Theory. Furthermore, the project will explore novel approaches for the specification of generic programs, provide the infrastructure for reasoning about generic programs, and demonstrate that GP has far-reaching and important applications in practice.

Planned Impact

We have separated our project into three main research tracks: theory, reasoning and specification, and application. These roughly correspond to three principal beneficiaries of our work: language designers, academics and students, and practising software engineers.

Impact on Language Designers
----------------------------

The vision of our project is to develop a unifying theory of generic programming. We expect that, in the medium to long term, the proposed research will lead to fundamental advances in the design of programming languages and techniques for generic programming. This will also have an impact on practicing programmers, providing them with powerful new programming tools and techniques.

Impact on Academics and Students
--------------------------------

There has been much work on generic programming in the Mathematics of Program Construction, Algorithmic Languages and Calculi and Functional Programming communities, of which Gibbons and Hinze are established members; this background will contribute greatly to the work we propose. Conversely, the proposed research will have an impact on the three communities, providing new applications for existing techniques and opening new areas of research.

We plan to organise a Summer School on Unifying Theories of Generic Programming towards the end of the project. Summer Schools are widely recognised within academia as a good means of providing a wealth of information and training to students and established researchers alike. Besides helping us in disseminating our results to the community, a Summer School will also bring leading researchers throughout the world to the UK as lecturers, to the benefit of other UK PhD students.

Finally, the work supported by this project will make a vital contribution to the career development of the two research assistants.

Gibbons and Hinze hold positions in the Software Engineering Programme, which provides postgraduate courses for professional software engineers leading to MSc degrees in Software Engineering and Software & System Security. The Software Engineering Programme has immediate impact on industry, as it draws students from prominent corporate entities such as IBM, Samsung and Sony, as well as many smaller companies. Like for previous projects, suitable aspects of the work will be fed back into these courses.

Impact on Software Engineers
----------------------------

We expect that the proposed work has both short and medium term impact on software engineers. The application track will have a very direct impact on the Haskell community by creating efficient libraries for a multitude of datatype-generic problems. Haskell supports a variety of concurrent and parallel programming models, serving as an ideal testbed for creating high-performance parallel version of these libraries. Looking further out, we hope to leverage these techniques to build industrial-strength libraries for mainstream languages.

Publications

10 25 50
publication icon
Adams M (2014) Optimizing SYB is easy!

publication icon
Adams M (2015) Optimizing SYB traversals is easy! in Science of Computer Programming

publication icon
De Haas W (2013) Automatic Functional Harmonic Analysis in Computer Music Journal

publication icon
Gibbons J (2018) Relational algebra by way of adjunctions in Proceedings of the ACM on Programming Languages