Reliable Many-Core Programming
Lead Research Organisation:
Imperial College London
Department Name: Computing
Abstract
The computational demands of modern computer applications make the
pursuit of high performance more critical than ever, and mobile,
battery-powered devices, as well as concerns related to climate
change, require high performance to co-exist with energy-efficiency.
Due to physical limits, the traditional means for improving hardware
performance by increasing processor frequency now carries an
unacceptably high energy cost. Advances in processor fabrication
technology instead allow the construction of many-core processors,
where hundreds or thousands of processing elements are placed on a
single chip, promising high performance and energy-efficiency through
sheer volume of processing elements.
Many-core devices are present in practically all consumer devices,
including smartphones and tablets. As a result, the general public in
developed countries interact with many-core software daily. Many-core
technology is also used to accelerate safety-critical software in
domains such as medical imaging and autonomous vehicle navigation.
It is thus important that many-core software should be reliable. This
requires reliable software from programmers, but also a reliable
"stack" to support this software, including compilers that allow
software to execute on many-core devices, and the many-core devices
themselves. Recent work on formal verification and testing by myself
and other researchers has identified serious technical problems
spanning the many-core stack. These problems undermine confidence in
applications of many-core technology: defective many-core software
could risk fatal accidents in critical domains, and impact negatively
on users in other important application areas.
My long-term vision is that the reliability of many-core programming
can be transformed through breakthroughs in programming language
specification, formal verification and test case generation, enabling
automated tools to assist programmers and platform vendors in
constructing reliable many-core applications and language
implementations. The aim of this five-year Fellowship is to undertake
foundational research to investigate a number of open problems whose
solution is key to enabling this long-term vision.
First, I seek to investigate whether it is possible to precisely
express the intricacies of many-core programming language using formal
mathematics, providing a rigorous basis on which software and language
implementations can be constructed.
Second, I aim to tackle several open problems that stand in the way of
effective formal verification of many-core software, which would allow
developers to obtain strong guarantees that such software will operate
as required.
Third, I will investigate raising this level of rigour beyond
many-core languages. A growing trend is for applications to be written
in relatively simple, high-level representations, and then
automatically translated into high-performance many-core code. This
translation process must preserve the meaning of programs; I will
investigate methods for formally certifying that it does.
Fourth, I will formulate new methods for testing many-core language
implementations, exploiting the rigorous language definitions brought
by my approach to enable high test coverage of subtle language
features.
Collectively, progress on these problems promises to enable a
*high-assurance* many-core stack. I will demonstrate one instance of
such a stack for the industry-standard OpenCL language and the PENCIL
high-level language, showing that high-level PENCIL programs can be
reliably compiled into rigorously-defined OpenCL, integrated with
verified library components, and deployed on thoroughly tested
implementations from many-core vendors.
Partnership with four leading many-core technology vendors, AMD, ARM,
Imagination Technologies and NVIDIA, provides excellent opportunities
for the advances the Fellowship makes to have broad industrial impact.
pursuit of high performance more critical than ever, and mobile,
battery-powered devices, as well as concerns related to climate
change, require high performance to co-exist with energy-efficiency.
Due to physical limits, the traditional means for improving hardware
performance by increasing processor frequency now carries an
unacceptably high energy cost. Advances in processor fabrication
technology instead allow the construction of many-core processors,
where hundreds or thousands of processing elements are placed on a
single chip, promising high performance and energy-efficiency through
sheer volume of processing elements.
Many-core devices are present in practically all consumer devices,
including smartphones and tablets. As a result, the general public in
developed countries interact with many-core software daily. Many-core
technology is also used to accelerate safety-critical software in
domains such as medical imaging and autonomous vehicle navigation.
It is thus important that many-core software should be reliable. This
requires reliable software from programmers, but also a reliable
"stack" to support this software, including compilers that allow
software to execute on many-core devices, and the many-core devices
themselves. Recent work on formal verification and testing by myself
and other researchers has identified serious technical problems
spanning the many-core stack. These problems undermine confidence in
applications of many-core technology: defective many-core software
could risk fatal accidents in critical domains, and impact negatively
on users in other important application areas.
My long-term vision is that the reliability of many-core programming
can be transformed through breakthroughs in programming language
specification, formal verification and test case generation, enabling
automated tools to assist programmers and platform vendors in
constructing reliable many-core applications and language
implementations. The aim of this five-year Fellowship is to undertake
foundational research to investigate a number of open problems whose
solution is key to enabling this long-term vision.
First, I seek to investigate whether it is possible to precisely
express the intricacies of many-core programming language using formal
mathematics, providing a rigorous basis on which software and language
implementations can be constructed.
Second, I aim to tackle several open problems that stand in the way of
effective formal verification of many-core software, which would allow
developers to obtain strong guarantees that such software will operate
as required.
Third, I will investigate raising this level of rigour beyond
many-core languages. A growing trend is for applications to be written
in relatively simple, high-level representations, and then
automatically translated into high-performance many-core code. This
translation process must preserve the meaning of programs; I will
investigate methods for formally certifying that it does.
Fourth, I will formulate new methods for testing many-core language
implementations, exploiting the rigorous language definitions brought
by my approach to enable high test coverage of subtle language
features.
Collectively, progress on these problems promises to enable a
*high-assurance* many-core stack. I will demonstrate one instance of
such a stack for the industry-standard OpenCL language and the PENCIL
high-level language, showing that high-level PENCIL programs can be
reliably compiled into rigorously-defined OpenCL, integrated with
verified library components, and deployed on thoroughly tested
implementations from many-core vendors.
Partnership with four leading many-core technology vendors, AMD, ARM,
Imagination Technologies and NVIDIA, provides excellent opportunities
for the advances the Fellowship makes to have broad industrial impact.
Planned Impact
The project has a large set of potential beneficiaries. The ultimate
beneficiaries of my long-term vision for reliable many-core
programming are end-users of software that employs many-core
acceleration. Because many-core hardware now ships with practically
all desktop, laptop and mobile devices, society is starting to
interact with many-core software on a daily basis. By leading to
improvements in the robustness of this software, and the platforms on
which it operates, my proposed project offers significant societal
benefit. Many-core technology is increasingly employed in
safety-critical contexts, for example it has found application in
medical imaging, pedestrian detection, and autonomous navigation for
self-driving cars. Reliability improvements in these areas offer to
make society safer. The rise of many-core technology also presents
new attack vectors that pose cyber-security and cyber-terrorism risks;
the advanced reasoning techniques that my research will bring will
allow many-core software to be rigorously engineered to defend against
these threats.
The fundamental advances made during the project will have long-term
academic impact, yielding results that researchers in several fields
can exploit. See the "Academic Beneficiaries" form for more details.
My Fellowship has very high potential for industrial impact.
Many-core platform vendors, including my project partners AMD, ARM,
Imagination Technologies and NVIDIA and several other companies, will
be able to exploit the Fellowship research to improve many-core
language implementations, increasing their market value. There is
potential for commercial exploitation of the research outcomes by tool
providers, and the wide range of software developers who apply
many-core acceleration in diverse domains will be able to build more
reliable software using the advanced analyses the Fellowship will
bring. All these stakeholders will benefit from the transformation in
the rigour of many-core programming language design offered by my
approach. Specifically, the 80+ industrial contributing members of
the Khronos Group will be beneficiaries of improvements to OpenCL and
related APIs that arise from the project. The large and growing
market associated with many-core technology means that uptake by
industry has the potential to lead to corresponding economic impact.
Close collaboration with industry during the Fellowship will give rise
to industrial impact on the Fellowship partners early during the
five-year project, and my strong impact plans will help to broaden
this impact to other industrial beneficiaries during the project
lifetime, leading to widespread industrial adoption over a 5-10-year
period. This widespread adoption has the potential to transform
many-core programming, impacting on industrial practice in the long
term and benefiting society for decades to come.
beneficiaries of my long-term vision for reliable many-core
programming are end-users of software that employs many-core
acceleration. Because many-core hardware now ships with practically
all desktop, laptop and mobile devices, society is starting to
interact with many-core software on a daily basis. By leading to
improvements in the robustness of this software, and the platforms on
which it operates, my proposed project offers significant societal
benefit. Many-core technology is increasingly employed in
safety-critical contexts, for example it has found application in
medical imaging, pedestrian detection, and autonomous navigation for
self-driving cars. Reliability improvements in these areas offer to
make society safer. The rise of many-core technology also presents
new attack vectors that pose cyber-security and cyber-terrorism risks;
the advanced reasoning techniques that my research will bring will
allow many-core software to be rigorously engineered to defend against
these threats.
The fundamental advances made during the project will have long-term
academic impact, yielding results that researchers in several fields
can exploit. See the "Academic Beneficiaries" form for more details.
My Fellowship has very high potential for industrial impact.
Many-core platform vendors, including my project partners AMD, ARM,
Imagination Technologies and NVIDIA and several other companies, will
be able to exploit the Fellowship research to improve many-core
language implementations, increasing their market value. There is
potential for commercial exploitation of the research outcomes by tool
providers, and the wide range of software developers who apply
many-core acceleration in diverse domains will be able to build more
reliable software using the advanced analyses the Fellowship will
bring. All these stakeholders will benefit from the transformation in
the rigour of many-core programming language design offered by my
approach. Specifically, the 80+ industrial contributing members of
the Khronos Group will be beneficiaries of improvements to OpenCL and
related APIs that arise from the project. The large and growing
market associated with many-core technology means that uptake by
industry has the potential to lead to corresponding economic impact.
Close collaboration with industry during the Fellowship will give rise
to industrial impact on the Fellowship partners early during the
five-year project, and my strong impact plans will help to broaden
this impact to other industrial beneficiaries during the project
lifetime, leading to widespread industrial adoption over a 5-10-year
period. This widespread adoption has the potential to transform
many-core programming, impacting on industrial practice in the long
term and benefiting society for decades to come.
Organisations
- Imperial College London (Fellow, Lead Research Organisation)
- Google (Collaboration)
- RWTH Aachen University (Collaboration)
- French Institute for Research in Computer Science and Automation (Project Partner)
- ARM (United Kingdom) (Project Partner)
- Imagination Technologies (United Kingdom) (Project Partner)
- Advanced Micro Devices (United States) (Project Partner)
- Nvidia (United States) (Project Partner)
People |
ORCID iD |
Alastair Donaldson (Principal Investigator / Fellow) |
Publications

Donaldson A
(2017)
Automated testing of graphics shader compilers
in Proceedings of the ACM on Programming Languages

Lidbury C
(2017)
Dynamic race detection for C++11

Sorensen T
(2017)
Cooperative kernels: GPU multitasking for blocking algorithms

Sorensen T
(2018)
GPU schedulers: how fair is fair enough?
Description | The main finding is that device drivers for many-core processors can be effectively tested through the use of transformations that preserve the meanings of programs. This led to the successful research behind the GraphicsFuzz spin-out company that was acquired by Google in 2018. |
Exploitation Route | The open source GraphicsFuzz project (http://github.com/google/graphicsfuzz) can be used by the GPU industry. |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | GraphicsFuzz, a tool that automatically finds bugs in compilers for graphics processing units (GPUs), is based on fundamental advances in metamorphic testing pioneered by Donaldson's Multicore Programming research group at Imperial College London, with funding from the Reliable Many-Core Programming Fellowship. GraphicsFuzz has found serious, security-critical defects in compilers from all major GPU vendors, leading to the spin-out of GraphicsFuzz Ltd., to commercialise the research. GraphicsFuzz Ltd. was acquired by Google in 2018 for an undisclosed sum, establishing a new London-based team at Google, led by Donaldson, to focus on Android graphics driver quality. Google have open-sourced the GraphicsFuzz tool, and it is now being used routinely to find serious defects in graphics drivers that affect the Android operating system (estimated 2.5 billion active devices worldwide) and Chrome web browser (used by an estimated 66% of all internet users). The global GPU market is predicted to grow to grow from $20 bn in 2019 to $200 bn in 2027, so the industrial uptake of this tools - based on fundamental research from Imperial College London - is having major impact on industry, making GPU software safer and more secure for end users, and reducing development costs of GPU software and drivers. Impact of the GraphicsFuzz tool: The original GLFuzz tool was highly successful in applying metamorphic testing to finding bugs in GLSL compilers from all seven major GPU providers. Donaldson disseminated this to a wide audience by writing a series of blog posts describing these findings (AMD post, Apple post, ARM post, Imagination Technologies post, Intel post, NVIDIA post, Qualcomm post). These bugs included three serious security vulnerabilities for which the Imperial team received public recognition - an Apple iOS information leak, an NVIDIA kernel mode vulnerability, and data-capture exploit caused by a defect in an ARM graphics driver. Donaldson and team promoted GraphicsFuzz on social media, reaching the front page of HackerNews and being featured by sites such as Phoronix. Acquisition of GraphicsFuzz Ltd. by Google, and open sourcing. Encouraged by this media interest, Donaldson and his post-docs Hugues Evrard and Paul Thomson secured funding from the DCMS/Innovate UK ICURe programme to investigate commercialisation opportunities for the work, leading to them founding GraphicsFuzz Ltd., incorporated in December 2017 and spun-out from Imperial in February 2018. The team promoted the technology via an innovative web-app that would run a set of GraphicsFuzz tests on a user's device, allowing them to tweet a summary of the test results. This raised the profile of GraphicsFuzz quickly, leading to the acquisition of the company by Google in August 2018 for an undisclosed sum. Google open-sourced the GraphicsFuzz tool in September 2018. Ongoing impact of GraphicsFuzz on graphics driver quality: Donaldson, Evrard and Thomson joined Google UK, forming a team specialising in methods to improve the quality of Android graphics device drivers, deployed on billions of devices worldwide. GraphicsFuzz is being used to find defects in Android graphics drivers from ARM, Qualcomm, Imagination Technologies and NVIDIA, who collectively produce GPUs for more than 79% of the smartphone and tablet GPU market. Tests that expose these bugs are being integrated with the Android Compatibility Test Suite, meaning that devices can only ship future versions of Android if these bugs are fixed in their drivers. Impact on Chrome: The WebGL technology brings graphics rendering capabilities to web applications but exposes widely-used browsers such as Chrome - used by most of the world's internet users - to graphics-related security vulnerabilities. GraphicsFuzz has been integrated with the ClusterFuzz project for continuous fuzz testing of Chrome, and since January 2019 has found more than a dozen security vulnerabilities in Chrome's WebGL implementation that had been missed by other testing methods. These vulnerabilities have now been fixed. Impact on open source graphics drivers: Google has used GraphicsFuzz to rigorously test AMD's open-source driver for Vulkan, finding a large number of driver bugs, which have been now fixed, including bugs that turned out to affect the widely-used LLVM compiler framework. GraphicsFuzz has also been used to find defects in the open-source Mesa drivers for Intel and NVIDIA GPUs. Tools for the Vulkan ecosystem: GraphicsFuzz has found more than 100 bugs collectively in a suite of tools that underpin the Vulkan graphics programming model: the glslang shader translator, the spirv-opt and spirv-val optimizer and validator that ship as part of the SPIRV-Tools framework, and the spirv-cross cross-compiler. Integration with Vulkan CTS: Every Vulkan driver must pass the Vulkan Conformance Test Suite (CTS), which now features a new graphicsfuzz test group, comprising tests generated by GraphicsFuzz that exposed bugs in previously-conformant Vulkan drivers. Adding these tests to the Vulkan CTS requires fixes to any defective drivers in order for them to remain conformant. The Vulkan CTS now features more than 300 GraphicsFuzz tests. |
First Year Of Impact | 2017 |
Sector | Digital/Communication/Information Technologies (including Software) |
Impact Types | Societal,Economic |
Description | Chrome University Research Program |
Amount | $172,000 (USD) |
Organisation | |
Sector | Private |
Country | United States |
Start | 01/2018 |
End | 12/2018 |
Description | EPSRC Progamme Grant |
Amount | £6,100,000 (GBP) |
Funding ID | EP/R006865/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2018 |
End | 12/2023 |
Description | EPSRC Standard Mode |
Amount | £672,083 (GBP) |
Funding ID | EP/R011605/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2018 |
End | 01/2021 |
Description | ICURe Program, Phase 1 |
Amount | £62,165 (GBP) |
Organisation | Innovate UK |
Sector | Public |
Country | United Kingdom |
Start | 01/2017 |
End | 10/2017 |
Description | UK Research Institute in Verified Trustworthy Software Systems |
Amount | £102,847 (GBP) |
Organisation | National Cyber Security Centre |
Sector | Public |
Country | United Kingdom |
Start | 07/2017 |
End | 03/2018 |
Title | Framework for compiler testing |
Description | A framework for applying metamorphic testing to graphics compilers. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | I have founded a spinout company, GraphicsFuzz, based on this innovation. |
URL | http://www.graphicsfuzz.com/ |
Description | |
Organisation | |
Country | United States |
Sector | Private |
PI Contribution | We are designing methods for automated testing of the graphics rendering components of web browsers, including Chrome. |
Collaborator Contribution | Google are funding the work, and are providing us with feedback on the effectiveness of our techniques. |
Impact | The work has led to numerous Chrome bug reports, e.g.: https://bugs.chromium.org/p/chromium/issues/detail?id=806201 |
Start Year | 2018 |
Description | RTWH Aachen |
Organisation | RWTH Aachen University |
Country | Germany |
Sector | Academic/University |
PI Contribution | We have had a fruitful collaboration with RTWH Aachen University in relation to floating point support for sybmbolic execution. |
Collaborator Contribution | Our partners implemented a tool, and we were able to write a paper presenting a detailed comparison of their tool with a competing tool developed at Imperial. |
Impact | We published a joint paper (https://doi.org/10.1109/ASE.2017.8115670), which won Best Experience Report award at the ASE 2017 conference. |
Start Year | 2016 |
Title | GLFuzz: Automated testing for graphics drivers |
Description | An automated method for testing graphics shader compilers. |
IP Reference | |
Protection | Protection not required |
Year Protection Granted | 2017 |
Licensed | Yes |
Impact | None yet. |
Title | GraphicsFuzz |
Description | An automated testing tool for graphics drivers. |
Type Of Technology | Software |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | The software has led to the formation of GraphicsFuzz, a spinout company, which was subsequently bought by Google. |
URL | http://github.com/google/graphicsfuzz |
Company Name | GraphicsFuzz Ltd. |
Description | GraphicsFuzz finds graphics driver bugs and helps to quickly fix them. |
Year Established | 2017 |
Impact | The company's technology has found bugs in numerous GPU drivers, which has led to conversations about licensing of the technology. GraphicsFuzz Ltd. was acquired by Google LLC in 2018. The directors of GraphicsFuzz now work as Google UK employees. |
Website | http://www.graphicsfuzz.com/ |
Description | A series of widely-read blog posts |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | A series of technical blog posts designed to promote the activities of my fellowship related to graphics driver fuzzing to a broad industrial audience. |
Year(s) Of Engagement Activity | 2016,2017,2018 |
URL | https://medium.com/@afd_icl/crashes-hangs-and-crazy-images-by-adding-zero-689d15ce922b |