Generic and Indexed Programming

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

Generic programming is about increasing the flexibility of programswithout compromising on safety. This is typically achieved throughintroducing new kinds of parametrization. A parametrized programabstracts from a family of different but related programs, which canbe retrieved by different instantiations of the parameter. A typesystem for the parameters allows the programmer to express constraintson parameters, and to statically check both the parametrized genericprogram and its specific instantiations. Examples of this notioninclude higher-order programming (parametrization by a function),parametric polymorphism (by a type, used uniformly), ad-hocpolymorphism (by a type, with some constraints), abstract datatypes(by an interface), datatype-genericity (by the shape of data), and soon.Our recent work has concentrated on one instance of this scheme,namely datatype-generic programming, in which programs areparametrized by type constructors, for which suitable instantiationsare lists of and trees of . We have investigated both programmingtechniques (including reasoning about generic programs, and using themto capture design patterns precisely), and language mechanisms(particularly lightweight approaches: patterns for simulatinghighly-expressive techniques in familiar but apparentlyless-expressive languages).Our recent experience has led us to a number of insights; this currentproposal seeks support to exploit them. Firstly, these lightweighttechniques, which we have to date been embedding in Haskell's stillrelatively expressive type system, are in fact applicable to even lessexpressive but more popular mainstream languages such as Java and C#.Secondly, the techniques have more applications than we firstthought; in particular, they offer a solution to the so-called expression problem , and other challenges involving extensibility:safe combination of independent extensions along multipledimensions.Thirdly, new developments in languages, such as Haskell'sgeneralized algebraic datatypes (GADTs), are leading us towards whatwe might call indexed programming: values are used to capture andstatically verify properties of programs. As a simple example, atype of fixed-length vectors is indexed by the length; moreelaborate examples involve a type of finite automata indexed bytheir states, and a type of type-safe interpreters indexed by thetype of expression they interpret. Indexed programming is related to our previous work on genericprogramming in two ways: it directly supports a flavour of genericprogramming, indexing programs by a representation of types; and it isanother lightweight approach, a kind of lightweight dependently-typedprogramming, providing many of the benefits with only a fraction ofthe costs - rather than having to resort to full-blown theorem provingand entangling notions of compile-time and run-time , the indicescan be reflected at the type level and checked using only mildextensions to existing type checking algorithms.We propose six packages of work implementing this vision: case studieson programming with indices; using McBride and Paterson's applicativefunctors to extend the Scrap your Boilerplate approach to genericprogramming to apply also to GADTs; applying GADTs to express safeextensible functions and datatypes, solving a generalization ofWadler's expression problem; translating these results to mainstreamlanguages such as Java and C#, and using them to implement a libraryof reusable code capturing some of the Gang of Four design patterns;providing mechanisms for statically inferring witnesses to propertypropositions, saving the programmer from having to invent andmanipulate them; investigating the use of indexed programmingtechniques for integrating the languages used in different tiers ofenterprise architectures, aiming for type-safe database and XMLaccess.

Publications

10 25 50
publication icon
A Rodriguez (2010) Comparing Datatype Generic Libraries in Haskell in Journal of Functional Programming

publication icon
Danielsson N (2006) Fast and loose reasoning is morally correct in ACM SIGPLAN Notices

publication icon
Gibbons J (2007) Metamorphisms: Streaming representation-changers in Science of Computer Programming

publication icon
GIBBONS J (2009) The essence of the Iterator pattern in Journal of Functional Programming

publication icon
Gibbons J (2006) Unbounded Spigot Algorithms for the Digits of Pi in The American Mathematical Monthly

publication icon
Gibbons J (2011) Just do it

publication icon
Gibbons J (2011) Maximum Segment Sum, Monadically (distilled tutorial) in Electronic Proceedings in Theoretical Computer Science

 
Description Generic programming is about increasing the flexibility of programs without compromising on safety. This is typically achieved through introducing new kinds of parametrization. A parametrized program abstracts from a family of different but related programs, which can be retrieved by different instantiations of the parameter. A type system for the parameters allows the programmer to express constraints on parameters, and to statically check both the parametrized generic program and its specific instantiations. Examples of this notion include higher-order programming (parametrization by a function), parametric polymorphism (by a type, used uniformly), ad-hoc polymorphism (by a type, with some constraints), abstract datatypes (by an interface), datatype-genericity (by the shape of data), and so on. A particular kind of generic programming is "indexed programming", whereby program types are parametrized by values, which capture and statically verify properties of programs. As a simple example, a type of fixed-length vectors is indexed by the length; more elaborate examples involve a type of finite automata indexed by their states, and a type of type-safe interpreters indexed by the type of expression they interpret. Conversely, indexed programming directly supports a flavour of datatype-generic programming, in which programs are indexed by a representation of types



One strand of work in this project concerned the use of generic and indexed programming techniques in the development of support for safe extensible programming. The challenge has been called the "Expression Problem", and can be characterized in terms of the independent statically-safe extension of software modules in multiple dimensions - such as in extending datatypes and operations. A second strand has been focussed on Bidirectional Programming - the transformation of data between a "source" and a "view" format, with updates on the view, without having to write and maintain two separate transformation programs.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://www.cs.ox.ac.uk/projects/gip/index.html