COOLER: COmpOsing LanguagE Runtimes

Lead Research Organisation: King's College London
Department Name: Informatics

Abstract

Traditionally, most software projects have been tackled using a single programming language. However, as our ambitions for software grow, this is increasingly unnatural: no single language, no matter how "good", is well-suited to everything. Increasingly, different communities have created or adopted non-traditional languages - often, though not always, under the banner of Domain Specific Languages (DSLs) - to satisfy their specific needs.

Consider a large organisation. Its back-end software may utilise SQL and Java; its desktop software C#; its website back-end PHP and the front-end JavaScript and HTML5; reports may be created using R; and some divisions may prototype software with Python or Haskell. Though the organisation makes use of different languages, each must execute in its own silo. We currently have few techniques to allow a single running program to be written using multiple languages. In the Cooler project, we call this the "runtime composition" problem: how can languages execute directly alongside each other, exchange data, call each other, optimise with respect to each other, etc.?

The chief existing technique for composing language runtimes is to translate all languages in the composition down to a base language, most commonly the byte code for one of the "big" Virtual Machines (VMs) - Java's HotSpot or .NET's CLR. Though this works well in some cases, it has two major problems. Firstly, a VM will intentionally target a specific family of languages, and may not provide the primitives needed by languages outside that family. HotSpot, for example, does not support tail-call optimisation or continuations, excluding many advanced languages. Secondly, the primitives that a VM exposes may not allow efficient execution of programs. For example, dynamically typed languages running on HotSpot run slower than on their seemingly much less sophisticated "home brew" VMs.

The Cooler project takes a new approach to the composition problem. It hypothesises that meta-tracing will allow the efficient composition of arbitrary language runtimes. Meta-tracing is a recently developed technique that creates efficient VMs with custom Just-in-Time (JIT) compilers. Firstly, language designers write an interpreter for their chosen language. When that interpreter executes a user's program, hot paths in the code are recorded ("traced"), optimised, and converted into machine code; subsequent calls then use that fast machine code rather than the slow interpreter. Meta-tracing is distinct from partial evaluation: it records actual actions executed by the interpreter on a specific user program. Meta-tracing is an exciting new technique for three reasons. Firstly, it leads to fast VMs: the PyPy VM (a fully compatible reimplementation of Python) is over 5 times faster than CPython (the C-based Python VM) and Jython (Python on the JVM). Secondly, it requires few resources: a meta-tracing implementation of the Converge language was completed in less than 3 person months, and runs faster than CPython and Jython. Thirdly, because the user writes the interpreter themselves, there is no bias to any particular family of languages.
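
The record-and-replay idea described above can be illustrated with a deliberately tiny sketch. It is not how PyPy or any real meta-tracing system is implemented: the three-instruction language, the names `step`, `run_trace`, `run`, `PROGRAM`, and the threshold `HOT_THRESHOLD` are all hypothetical, and the recorded "trace" is simply replayed by the same Python function rather than being optimised and converted into machine code. What it does show is the control flow: a loop is counted until it is hot, one iteration of the actions actually executed is recorded, and later iterations follow the recorded path until a guard fails.

```python
# Toy sketch only: replaying a trace here reuses step(), standing in for
# the optimised machine code a real meta-tracing JIT would emit.
HOT_THRESHOLD = 3  # hypothetical: backward jumps seen before tracing starts

def step(instr, env, pc):
    """Execute one instruction and return the next program counter."""
    op = instr[0]
    if op == "add":                      # env[dst] += env[src]
        env[instr[1]] += env[instr[2]]
        return pc + 1
    if op == "addi":                     # env[dst] += constant
        env[instr[1]] += instr[2]
        return pc + 1
    if op == "jump_if_lt":               # branch to target if env[a] < env[b]
        return instr[3] if env[instr[1]] < env[instr[2]] else pc + 1
    raise ValueError(f"unknown op {op!r}")

def run_trace(trace, env, stats):
    """Replay a recorded loop until a guard fails; return the exit pc."""
    while True:
        for pc, instr, expected_next in trace:
            actual_next = step(instr, env, pc)
            stats["traced"] += 1
            if actual_next != expected_next:  # guard: path differs from recording
                return actual_next

def run(program, env):
    """Interpret `program`, recording and then replaying hot loops."""
    pc, counts, traces, recording = 0, {}, {}, None
    stats = {"interpreted": 0, "traced": 0}
    while pc < len(program):
        if recording is None and pc in traces:
            pc = run_trace(traces[pc], env, stats)
            continue
        instr = program[pc]
        next_pc = step(instr, env, pc)
        stats["interpreted"] += 1
        if recording is not None:
            header, ops = recording
            ops.append((pc, instr, next_pc))  # record the action actually taken
            if next_pc == header:             # loop closed: trace is complete
                traces[header] = ops
                recording = None
        elif next_pc <= pc:                   # a backward jump marks a loop
            counts[next_pc] = counts.get(next_pc, 0) + 1
            if counts[next_pc] >= HOT_THRESHOLD:
                recording = (next_pc, [])
        pc = next_pc
    return stats

# Sums 0 .. n-1 into "total".
PROGRAM = [
    ("add", "total", "i"),
    ("addi", "i", 1),
    ("jump_if_lt", "i", "n", 0),
]

env = {"i": 0, "n": 10, "total": 0}
stats = run(PROGRAM, env)
print(env["total"], stats)  # 45 {'interpreted': 12, 'traced': 18}
```

With `n` set to 10, the first four loop iterations go through the ordinary interpreter (the fourth being the one that is recorded) and the remaining six follow the trace; the final comparison fails the guard and control returns to the interpreter. In a real system the benefit comes from optimising the recorded trace, which this sketch does not attempt.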

The Cooler project will initially design the first language specifically designed for meta-tracing (rather than, as existing systems do, reusing an unsuitable existing language). This will enable the exploration of various aspects of language runtime composition. First, cross-runtime sharing: how can different paradigms (e.g. imperative and functional) exchange data and behaviour? Second, optimisation: how can programs written in multiple paradigms be optimised (space and time)? Finally, the limits of the approach will be explored through known hard problems: cross-runtime garbage collection; concurrency; and to what extent runtimes not designed for composition can be composed. Ultimately, the project will allow users to compose runtimes and programs in ways that are currently infeasible.

Planned Impact

Programming languages power much of our society, and much of our research, both directly and indirectly. The Cooler project's contributions will thus benefit several other groups. We identify two in particular.

The first group are language designers. Currently, language designers face an unappealing trade-off when designing a new language: spend most of the time on design and get a slow implementation; or spend most of the time on the implementation and get a bad design. Creating a fast language implementation traditionally takes tens of man years (at a minimum), far beyond that which a small team can manage. Slow implementations have problems beyond just making programs slow to run: new language designs are typically ignored when they are part of a slow implementation. In other words, the community as a whole tends to give little credence to language design features until they can be shown to run adequately fast. The more exotic the design, the more the community needs the corresponding implementation to be fast.

The release of Mammal will allow language designers to produce "fast enough" meta-tracing VM implementations with significantly less effort than current approaches. This will allow new language ideas to be evaluated on their merits rather than the size of the team that produced the accompanying implementation. We hope that this will encourage a new age of experimentation in language design, similar to the rapid progress of the 1960s, which may ultimately lead to new solutions for challenges with which current languages struggle (e.g. concurrency).

To reach language designers, we will focus on community appearances and publicly available and accessibly written articles, ranging from blog-type articles to traditional research papers. We will also hold a summer school towards the end of the project to engage with interested parties.

Software developers form the second group of beneficiaries. Mammal will lead to an increase in the number of languages with fast implementations. This is likely to increase the current trend for developers to experiment with 'non-mainstream' languages (e.g. Clojure, Scala), encouraging a wider understanding of the pros and cons of different languages and paradigms.

In the long term, Cooler will open up a fundamentally new way of implementing systems which have diverse software needs. For example, health systems have multiple aspects (front-ends, reporting systems, storage etc.), each of which is a substantial system in and of itself. A GP's front-end system, amongst other things, collects input from the GP (increasingly using web-based systems), queries various databases, and provides various reports. Each component of this workflow might be most naturally implemented using a different language (perhaps JavaScript, C# / LINQ, R), yet developers have to choose one language and use it for all aspects. Cooler will enable language implementers to provide runtimes which can be composed with others. Software developers will then use this ability to implement systems in novel ways.

Publications

Barrett E (2017) Virtual machine warmup blows hot and cold in Proceedings of the ACM on Programming Languages

Barrett E (2015) Approaches to interpreter composition in Computer Languages, Systems & Structures

Bauman S (2017) Sound gradual typing: only mostly dead in Proceedings of the ACM on Programming Languages

Bauman S (2015) Pycket: a tracing JIT for a functional language in ACM SIGPLAN Notices

Pape T (2017) Record data structures in Racket: usage analysis and optimization in ACM SIGAPP Applied Computing Review

 
Description Language composition has the potential to radically change how people can evolve their software. Previously it was thought difficult to compose small languages and impossible to compose large languages. In this project, we showed not only that large programming languages can be composed together, but that the result can run fast enough to be usable for real-world activities. Indeed, we were able to get far further than we expected in this regard, ultimately releasing the PyHyp language composition, the first ever fine-grained composition of two large languages (Python and PHP). This opens this area up for exploration by others.
Exploitation Route By opening up the field of language composition, it is likely that both academics and industry will start to fill in the remaining blanks. We have already seen publications from the Hasso-Plattner-Institut in Germany which directly build upon our research artefacts, and I expect this to continue and increase in volume over the next few years.
Sectors Digital/Communication/Information Technologies (including Software); Financial Services, and Management Consultancy; Other

URL http://tratt.net/laurie/blog/entries/fine_grained_language_composition.html
 
Title Krun 
Description Krun is a system for running software benchmarks rigorously. It controls more confounding variables than any previous tool.
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? Yes  
Impact Too soon to say. 
URL http://soft-dev.org/src/krun/
 
Description Visualizing Cross-Language Execution 
Organisation Oracle Corporation
Department Oracle Corporation UK Ltd
Country United Kingdom 
Sector Private 
PI Contribution I am the PI on this project investigating the visualisation of program execution in the face of language composition.
Collaborator Contribution We have regular meetings with two members of Oracle Labs; they have provided ideas, and data to validate our ideas.
Impact Not currently applicable.
Start Year 2016
 
Title Krun 
Description Krun is a state-of-the-art software benchmark runner. It rigorously benchmarks software, controlling more confounding variables than any previous tool. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact Too early to say. 
 
Title PyHyp 
Description PyHyp is the first large-scale language composition of two real-world languages: in this case PHP and Python. It consists of a fast virtual machine to execute PyHyp programs and special editor support via the Eco tool to make writing composed programs plausible. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Too early to state. 
URL http://soft-dev.org/pubs/files/pyhyp/
 
Title SQPyte 
Description SQPyte takes the widely used SQLite database and converts part of it into a fast meta-tracing virtual machine. When called regularly from a programming language, SQPyte is generally significantly faster than SQLite. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact The SQLite project contacted us to find out what they can learn from SQPyte. This discussion is ongoing. 
URL http://soft-dev.org/pubs/files/sqpyte/
 
Title Unipycation 
Description A proof-of-concept composition of Python and Prolog. 
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact Research prototype. 
URL http://soft-dev.org/src/unipycation/
 
Title libkalibera 
Description libkalibera contains reimplementations of the benchmarking methods from the following two papers: "Rigorous benchmarking in reasonable time" (Tomas Kalibera, Richard Jones) and "Quantifying performance changes with effect size confidence intervals" (Tomas Kalibera, Richard Jones). libkalibera started off as a pure Python module by the Cooler project team, but support for other languages has since been added by other people.
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact libkalibera has been extended and used by other authors, e.g. in: "Dynamically Composing Languages in a Modular Way: Supporting C Extensions for Dynamic Languages". Matthias Grimmer, Chris Seaton, Thomas Wuerthinger, Hanspeter Moessenboeck. Modularity 2015 (to appear)
URL http://soft-dev.org/src/libkalibera/