Adaptive Just-In-Time Parallelisation (AJITPar)

Lead Research Organisation: University of Glasgow
Department Name: School of Computing Science

Abstract

A key element of the multicore software crisis is a lack of abstraction: most parallel code mixes coordination and computation, and assumptions about the architecture are often hard-coded, tying the code to a specific architecture. For problems with regular parallelism, i.e. where the number and size of subproblems can be statically predicted, good parallel performance on an architecture can be obtained using static techniques including static compilation. However, many problems are irregular and require dynamic techniques such as we propose.

This project aims to address the challenges of programs with irregular parallelism by sharing the burden of parallel coordination between the programmer and an adaptive runtime system. The approach can be summarised in the slogan "The programmer knows the problem, the runtime knows the hardware." That is, the programmer merely exposes the parallelism in the problem domain by means of architecture-independent declarative constructs, and the runtime system decides how to map the parallelism to a specific architecture. The project aims for a framework combining portable parallel performance across a range of architectures with "compile once, run anywhere" portable binaries. The project will test whether portable performance has been achieved by comparing the parallel performance of a suite benchmarks on several parallel architectures, from standard desktop workstations to high-end compute clusters.

The proposed research will advance the state-of-the-art in two areas.
(1) The development of runtime systems that tightly integrate dynamic scheduling of parallelism with dynamic trace-based just-in-time (JIT) compilation.
(2) The systematic investigation into how to specialise architecture-independent declarative parallelism to a specific architecture at runtime by means of declarative and modular code transformations.
Crucially, both novelties need to be combined; neither can deliver portable parallel performance on its own. The parallelising JIT runtime system may not find the "right" amount of parallelism without dynamically applying code transformations, and the transformations cannot in general be applied statically because they require knowledge of runtime timing data.

The research is timely in exploiting emergent trace-based JIT technology. Moreover, the project has enormous transformative potential. It investigates JIT parallelisation using Haskell because its pure functional context enables safe code transformation. If successful, however, the techniques established can be used to transform the parallel portability of software in many programming languages with JIT compilers, not least "functional second" languages like JavaScript, Python, Scala and C#. Moreover, a JIT compiler's innate ability to optimise traces spanning several layers of the software stack enables a parallelising JIT runtime to exploit potential parallelism uniformly across all software layers and components.

JIT compilers for languages like Java or JavaScript have spread widely in recent years, being deployed on a wide variety of architectures, from smart phones to desktop computers to web servers. Because of its transformative potential, the design and implementation of a parallelising JIT compiler for Haskell will be followed with interest by implementers of parallel languages in the UK and abroad, both academic and in industry, e.g. at Microsoft, Google, Mozilla.

This speculative project will be undertaken by an experienced and energetic team in a vibrant environment. The team will be led by Professor Phil Trinder, who has 20 year's experience in the field, delivering 13 successful research projects, and with over 100 publications. Dr. Maier contributes deep knowledge of parallel language implementation and program analysis.

Planned Impact

The applicants have an excellent track record in maximising research impact, and some relevant examples are as follows. The SCSCP protocol developed in a series of multidisciplinary parallel Computer Algebra projects has become a de facto standard with implementations in 9 Computer Algebra Systems (including Maple and MATLAB) and several languages (including Java, C/C++, Haskell). The HLDTS EPSRC project evaluated Erlang for telecommunications servers in conjunction with Motorola UK Research Labs. The project produced 6 conference publications and 2 journal papers; slides comparing C++ with Erlang have had 16K views on Reddit; several product groups in Motorola started using Erlang; the DM prototype we constructed became a mission-critical Motorola product; two Erlang evangelists were established within Motorola. HLDTS lead to the 3.6M Euro FP7 RELEASE project that aims to improve the scalability of Erlang on large commodity servers. The project, coordinated by Prof Trinder, engages both multinationals like Ericsson and EDF, but also a UK SME, Erlang Solutions.

The project will develop a prototype Haskell compiler, and the technology may be incorporated in GHC. This would convey considerable economic benefits as the user bases of both Haskell and GHC are substantial and growing extremely rapidly. Even greater impact would follow if ``functional second'' languages with very large developer communities, like JavaScript, Python, Scala, and C#, took up the technology.

Software is of strategic importance in a technological economy. An example is the Digital Economy, which already requires software that delivers good parallel performance on evolving parallel architectures. If successful the proposed research will help alleviate the parallel software crisis that is impinging on all software. Specifically, the technology has the potential to reduce the time and costs for development, and for porting between architectures, hence reducing the cost of products and their time to market.

UK society is increasingly dependent on a technological infrastructure that incorporates distributed and embedded software systems in an essential way; examples include cloud services (£2.4bn/year, set to grow to £6.1bn/year by 2014), telecommunication networks (£5bn/year) and mobile devices (£4bn/year). However, complexity is exploding with the increasing number, diversity, and parallel capabilities of the hardware architectures these software systems are deployed on. A unifying software framework, such as a portable bytecode format and JIT compiler ensuring performance-preserving portability between architectures, would help bring complexity back under control. If successful the proposed research will provide such a framework, thereby facilitating continued cost-effective infrastructure development and maintenance. This will help the UK maintain its European lead in tech infrastructure, benefiting society as a whole.

Our impact strategy is to engage both industry and academia, maintaining a web 2.0 presence facing both communities, e.g. hosting academic publications, releasing open source software, and blogging/slide-sharing in industrial fora. We will present and debate at academic conferences, at industrial fora like Techmeetup, through networks of excellence like HiPEAC, and via web technology, e.g. appropriate Google groups, email lists, wikis, etc. We will also publish in premier journals and via Heriot-Watt web pages and press releases. We will exploit our close personal links with the GHC team at Microsoft Research Labs in Cambridge to disseminate our results.

The project will train a PhD student to become an expert engineer on programming language support for manycore architectures. Such engineers are in extremely short supply and can have significant economic impact. Additionally, we will expose students to the AJITPar compiler in courses on parallel programming, thereby disseminating our results and training future developers.
 
Description We are investigating whether dynamic scheduling, cost models, and JIT technology can facilitate automatic adaption of software to a range of parallel architectures.

Our results show that the adaptions can improve performance across a range of multicore, NUMA and cluster architectures. We believe there is the potential to automatically adapt the program, but are currently using manual adaptions.
Exploitation Route We believe that the AJITPar techniques we are exploring could be applied to other programming languages with tracing JIT compilers. In particular we are exploring their application to the Python programming language, and specifically to the many scientific applications that use the SciPy software stack. To this end we have worked with Astrophysicists at Cambridge, Glasgow and NASA to identify suitable code bases to adapt with our techniques.
Sectors Aerospace, Defence and Marine,Digital/Communication/Information Technologies (including Software)

URL http://www.dcs.gla.ac.uk/~pmaier/AJITPar/
 
Description Network
Amount £91,619 (GBP)
Funding ID EP/P006434/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 11/2016 
End 10/2019
 
Description Session Types for Reliable Distributed Systems (STARDUST)
Amount £1,800,000 (GBP)
Funding ID EP/T014628/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 10/2020 
End 09/2024
 
Description Standard Research - NR1
Amount £294,007 (GBP)
Funding ID EP/M022641/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 03/2015 
End 02/2020
 
Title JIT Costing Adaptive Skeletons for Performance Portability 
Description  
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
 
Title Replicable Parallel Branch and Bound Search 
Description  
Type Of Material Database/Collection of data 
Year Produced 2017 
Provided To Others? Yes  
 
Title Transparent fault tolerance for scalable functional computation 
Description  
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
 
Title YewPar: Skeletons for Exact Combinatorial Search 
Description  
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
URL http://researchdata.gla.ac.uk/id/eprint/935
 
Description Microsoft Research 
Organisation Microsoft Research
Country Global 
Sector Private 
PI Contribution Presented current state of project and future plans at project review at Microsoft Research, Cambridge, 1 Oct 2015.
Collaborator Contribution Simon Peyton Jones (Microsoft Research) has provided advice and feedback from the project start in 2013. Simon Peyton Jones engaged in a 1-day project review at Microsoft Research offices in Cambridge on 1 Oct 2015.
Impact Outcome of project review: Ongoing development of precise semantics for skeletons transformations based on rewriting and expansion to task graphs.
Start Year 2013
 
Description SICSA Distinguished Visiting Fellowship 
Organisation Indiana University
Department School of Informatics, Computing and Engineering
Country United States 
Sector Academic/University 
PI Contribution Applied for funding for a SICSA Distinguished Visiting Fellowship: www.sicsa.ac.uk/funding/academics-postdoctoral-researchers/host-a-visitor/ Hosted Professor Sam Tobin-Hochstadt during his two week visit (22/1/17 - 5/2/17). Made arrangements for his delivery of 3 seminars (details below). Beyond the fellowship we retain close collaboration with Professor Tobin-Hochstadt on the development of the Pycket compiler.
Collaborator Contribution Deliver 3 seminars at St Andrews, Edinburgh, Glasgow Universities. Full details available at http://www.sicsa.ac.uk/wp-content/uploads/2017/03/Sam-Tobin-Hochstadt.pdf Prof. Tobin-Hochstadt's group lead the development of the Pycket compiler, which is a key software component of the AJITPar project.
Impact 3 research seminars at St Andrews, Edinburgh, Glasgow Universities. Development of JIT-based cost models for the Pycket compiler. Development of TCP communication for the Pycket compiler. Development of shared-memory interprocess communication for the Pycket and Racket compilers.
Start Year 2015
 
Description SICSA Distinguished Visiting Fellowship 
Organisation Indiana University
Country United States 
Sector Academic/University 
PI Contribution Applied for funding for a SICSA Distinguished Visiting Fellowship: www.sicsa.ac.uk/funding/academics-postdoctoral-researchers/host-a-visitor/ Hosted Professor Ryan Newton during his 10 day visit (11/6/14 - 20/6/14). Made arrangements for his delivery of 3 seminars and a Hackathon (details below). Beyond the fellowship we retain close collaboration with Professor Newton.
Collaborator Contribution Deliver 3 seminars at St Andrews, Edinburgh, Glasgow Universities, and help lead a Hackathon at Heriot-Watt University. Full details available at http://www.sicsa.ac.uk/wp-content/uploads/2014/03/Professor-Ryan-R-Newton.pdf
Impact 3 research seminars at St Andrews, Edinburgh, Glasgow Universities A Hackathon to help develop a common open source code base for distributed-memory Haskell.
Start Year 2014
 
Title Fork of Pycket compiler, augmented with JIT-based cost model 
Description Fork of the Pycket compiler (master repository: https://github.com/samth/pycket) This version of Pycket has been extended with the JIT-based cost model developed in the AJITPar project. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Used in two AJITPar publications. 
URL https://github.com/magnusmorton/pycket/tree/runtime_trace_analysis
 
Title Haskell Reference Implementation of Maximal Clique for SICSA Multicore Challenge III 
Description Reference implementations of the maximal clique graph search benchmark for phase III of the international SICSA Multicore Challenge: www.macs.hw.ac.uk/sicsawiki/index.php/Challenge_PhaseIII The challenge invites developers to compare multicore language technologies on common problems and platforms: www.macs.hw.ac.uk/sicsawiki/index.php/MultiCoreChallenge 
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact Used as part of the ongoing SICSA Multicore Challenge. 
URL http://www.macs.hw.ac.uk/sicsawiki/index.php/PhaseIII_Haskell_Versions
 
Description CEFP 2015 talk (Magnus) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Talk "Costing JIT Traces for Adaptive Parallelism", Central-European Functional Programming School, Budapest, 8 July 2015

This talk presented a technique for building cheap-to-compute cost models by piggybacking on the traces collected by a just-in-time compiler.
Year(s) Of Engagement Activity 2015
URL http://people.inf.elte.hu/cefp/
 
Description EuroPar 2014 talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Talk "High-Performance Computer Algebra: A Hecke Algebra Case Study" at EuroPar 2014, Porto, 27 August 2014.

The talk presented a case study in parallel computational algebra, including results of running computational algebra experiments on a supercomputer. It sparked discussions on why there are so few large scale parallel computational algebra experiments, and why current computer architectures favour numerical rather than algebraic or symbolic algorithms.

Not aware of any impact.
Year(s) Of Engagement Activity 2014
URL http://europar2014.dcc.fc.up.pt/
 
Description FHPC 2016 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Talk "JIT Costing Adaptive Skeletons for Performance Portability", 5th International Workshop on Functional High-Performance Computing, Nara, Japan, September 22, 2016.

This talk demonstrated the suitability for JIT-based cost models for guiding program transformations to adapt parallelism.
Year(s) Of Engagement Activity 2016
URL https://sites.google.com/site/fhpcworkshops/fhpc-2016/call-for-participation-txt/callforparticipatio...
 
Description Haskell 2014 talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Talk "The HdpH DSLs for Scalable Reliable Computation", Haskell Symposium 2014, Gothenburg, 4 Sep 2014.

The talk presented two domain-specific languages for parallel programming on large-scale distributed-memory architectures, one providing fault tolerance, the other offering the programmer a highly abstract mathematical model of non-uniform communication latency. The talk sparked a number of discussions with fellow researchers on the potential suitability of such an abstract model of latency in their domains.

Not aware of any impact.
Year(s) Of Engagement Activity 2014
URL https://www.haskell.org/haskell-symposium/2014/
 
Description Haskell Implementors 2014 talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Health professionals
Results and Impact Talk "The Implementation of the HdpH DSLs: Details and Difficulties", Haskell Implementors Workshop 2014, Gothenburg, 6 Sep 2014.

The talk presented details on the implementation of serialisable closure in the HdpH DSLs. The talk led to ongoing email discussions with fellow researchers about the safest and most efficient ways to implement serialisable closures in Haskell.

The infrastructure provided by GHC (the standard Haskell compiler) for supporting serialisable closures is currently undergoing a design review. Ideas presented in the talk have been taken on board, and are currently being refined in email discussions and blog posts.
Year(s) Of Engagement Activity 2014
URL https://www.haskell.org/haskellwiki/HaskellImplementorsWorkshop/2014/
 
Description IFL 2015 talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Talk "Towards An Adaptive Framework For Performance Portability", 27th Symposium on Implementation and Application of Functional Languages, Koblenz, 16 Sep 2015

The talk presented progress towards the adaptive parallel programming framework developed in the AJITPar project, specifically progress on a distributed task scheduler and on light-weight cost models based on JIT-traces. The talk sparked discussions with fellow researchers on scheduling policies to reduce latency.
Year(s) Of Engagement Activity 2015
URL http://ifl2015.wikidot.com/
 
Description Meetings with computational scientists 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Presented AJITPar results to physists and chemists at Glasgow University, 11 Jan 2016.

Discussed the needs of computational scientists for a system similar to AJITPar, specifically for scientific code written in Python. Identified potential candidate case studies. Joint exploration of these candidates is ongoing, and has resulted in several follow up meetings.
Year(s) Of Engagement Activity 2016
 
Description Presentation at Trends in Functional Programming (TFP'18) in Canterbury, UK 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact This was a research symposium presentation where we presented the Communications Cost Model that we developed in the AJITPar project. TITLE: A Communications Cost Model for a Distributed Scheme.
ABSTRACT. AJITPar is a distributed, parallel framework for the Pycket
Scheme implementation, which dynamically adjusts parallel code based
on runtime execution time predictions. This system requires a lightweight
model for predicting This paper presents a communications cost model
for the AJITPar system; and demonstrates its additive property. We
describe the iterative development of the model and use statistical modelling
techniques to infer a final cost model. The final model costs both
(de)serialisation and network send time. (De)serialisation time is shown
to dominate the communications overhead by an order of magnitude on
a typical cluster.
Year(s) Of Engagement Activity 2018
URL https://www.cs.kent.ac.uk/events/tfp17/schedule.html
 
Description RAC 2016 (Magnus) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Talk "JIT-Based Cost Analysis for Dynamic Program Transformations", Resource Aware Computing Workshop, Eindhoven, The Netherlands, April 2, 2016.

This talk presented JIT-based cost models and demonstrated that these models accurately predict the cost program transformations.
Year(s) Of Engagement Activity 2016
URL http://resourceanalysis.cs.ru.nl/rac2016/
 
Description RWDSL 2014 talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Talk "The Design of the AJITPar Parallel Coordination DSL", Workshop on Real World Domain Specific Languages, Heriot-Watt, 1 May 2014

The talk outlined the design of the task parallel coordination DSL used in the AJITPar project. The event sparked discussions on the nature of domain-specific languages and led to the planning of an international workshop on Real World Domain Specific Languages, to be held in 2016 in Barcelona. Prof Trinder and Dr Maier were invited to serve on the program committee.
Year(s) Of Engagement Activity 2014
URL http://www.macs.hw.ac.uk/~rs46/dsl-workshop/
 
Description SPLS 2015 talk (Magnus) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Talk "Costing JIT Traces for Adaptive Parallelism", Scottish Programming Languages Seminar, St Andrews, 15 June 2015

This talk presented a technique for building cheap-to-compute cost models by piggybacking on the traces collected by a just-in-time compiler.
Year(s) Of Engagement Activity 2015
URL https://ff32.host.cs.st-andrews.ac.uk/spls/
 
Description SPLS June 2016 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Other audiences
Results and Impact Talk "JIT Cost Analysis for Adaptive Skeletons", Scottish Programming Languages Seminar, Heriot-Watt, 22 June 2016

This talk demonstrated the usefulness of trace-based cost models for guiding program transformations to adapt parallelism.
Year(s) Of Engagement Activity 2016
URL http://www.macs.hw.ac.uk/~rs46/spls-hwu-june2016/
 
Description TFP 2015 talk (Magnus) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Talk "Costing and Transforming JIT Traces for Adaptive Parallelism", Symposium on Trends in Functional Programming, Sophia Antipolis, 3 June 2015

This talk presented a technique for building cheap-to-compute cost models by piggybacking on the traces collected by a just-in-time compiler.
Year(s) Of Engagement Activity 2015
URL https://tfp2015.inria.fr/