Distributed Heterogeneous Vertically IntegrateD ENergy Efficient Data centres

Lead Research Organisation: Lancaster University
Department Name: Computing & Communications


Our world is in the midst of a "big data" revolution, driven by the ubiquitous ability to gather, analyse, and query datasets of unprecedented variety and size. The sheer storage volume and processing capacity required to manage these datasets have resulted in a transition away from desktop processing and toward warehouse-scale computing inside data centres. State-of-the-art data centres, employed by the likes of Google and Facebook, draw 20-30 MW of power, equivalent to 20,000 homes, with these companies needing many data centres each. The global data centre energy footprint is estimated at around 2% of the world's energy consumption and doubles every five years [33, 34]. Contemporary data centres have an average overhead of 90% [32], meaning that they consume up to 1.9 MW to deliver 1 MW of IT support; this is neither cost-effective nor environmentally sound. If the exponential data growth and processing capacity are to scale in the way that both the public and industry have come to rely upon, we must tackle the data centre energy crisis or face the reality of stagnated progress. With the semiconductor industry's inability to further lower operating voltages in processor and memory chips, the challenge is in developing technologies for large-scale data-centric computation with energy as a first-order design constraint.
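The 90% overhead figure quoted above corresponds to the standard power usage effectiveness (PUE) ratio: total facility power divided by the power delivered to IT equipment. The following is a minimal Python sketch of that arithmetic using the 1.9 MW and 1 MW values from the text; the function names are illustrative assumptions and are not part of any DIVIDEND tooling.

```python
def pue(total_facility_mw: float, it_load_mw: float) -> float:
    """Power usage effectiveness: total facility power / IT power."""
    if it_load_mw <= 0:
        raise ValueError("IT load must be positive")
    return total_facility_mw / it_load_mw


def overhead_fraction(total_facility_mw: float, it_load_mw: float) -> float:
    """Non-IT power (cooling, power delivery, etc.) as a fraction of IT power."""
    return pue(total_facility_mw, it_load_mw) - 1.0


if __name__ == "__main__":
    # Figures from the text: 1.9 MW drawn to deliver 1 MW of IT load.
    total_mw, it_mw = 1.9, 1.0
    print(f"PUE = {pue(total_mw, it_mw):.2f}")
    print(f"Overhead = {overhead_fraction(total_mw, it_mw):.0%}")
```

A PUE of 1.0 would mean that every watt drawn reaches the IT equipment, so lowering the overhead moves the ratio toward 1.0.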
The DIVIDEND project attacks the data centre energy efficiency bottleneck through vertical integration, specialisation, and cross-layer optimisation. Our vision is to present heterogeneous data centres, combining CPUs, GPUs, and task-specific accelerators, as a unified entity to the application developer and let the runtime optimise the utilisation of the system resources during task execution. DIVIDEND embraces heterogeneity to dramatically lower the energy per task through extensive hardware specialisation while maintaining the ease of programmability of a homogeneous architecture. To lower communication latency and energy, DIVIDEND leverages SoC integration and prefers a lean point-to-point messaging fabric over complex connection-oriented network protocols. DIVIDEND addresses the programmability challenge by adapting and extending the industry-led heterogeneous systems architecture programming language and runtime initiative to account for energy awareness and data movement. DIVIDEND provides for a cross-layer energy optimisation framework via a set of APIs for energy accounting and feedback between hardware, compilation, runtime, and application layers. The DIVIDEND project will usher in a new class of vertically integrated data centres and will take a first stab at resolving the energy crisis by improving the power usage effectiveness of data centres by at least 50%.
Description We have shown that, by optimizing and scheduling code in different ways, different performance and energy trade-offs can be achieved on heterogeneous multi-core architectures. This demonstrates that compiler-based techniques can play a key role in energy and performance optimization for heterogeneous multi- and many-core systems.

We also performed the first comprehensive study of the effectiveness of different power capping techniques. This provides insights for designing better power and performance optimization techniques in the future. We are among the first to show that deep learning can be used to replace compiler heuristics, leading to far better performance on parallel GPGPU programs.
Exploitation Route We have released our prototype compiler tool as open source. It can be downloaded from https://github.com/zwang4/dividend.

We have also published our results in over 10 papers, so that the research community can benefit from our key findings.
Sectors Digital/Communication/Information Technologies (including Software)

Description EPSRC iCASE Studentship
Amount £35,000 (GBP)
Organisation Arm Limited 
Sector Private
Country United Kingdom
Start 01/2016 
End 06/2019
Description Royal Society
Amount £12,000 (GBP)
Organisation The Royal Society 
Sector Charity/Non Profit
Country United Kingdom
Start 03/2017 
End 03/2019
Title HSA auto-tuning framework 
Description A compiler-based auto-tuning tool for HSA applications. It is the first automatic tool for tuning HSA applications. 
Type Of Material Improvements to research infrastructure 
Year Produced 2016 
Provided To Others? Yes  
Impact Two research groups (the project partners), those of Albert Cohen at Inria, France, and Alexandru Amaricai at Politehnica University of Timișoara, Romania, are using our tool. 
URL https://github.com/zwang4/dividend
Description Collaboration with Dionasys 
Organisation Peking University
Department School of Electronics Engineering and Computer Science
Country China 
Sector Academic/University 
PI Contribution We are collaborating on a joint project funded by the Royal Society. The project mines open-source repositories such as GitHub to automatically detect bugs and generate fixes. The Lancaster team contributes compiler and code analysis expertise to the project.
Collaborator Contribution The Peking University team contributes staff time and expertise in natural language processing to the project.
Impact The project has just started and no outcomes have been generated yet.
Start Year 2017
Description Collaboration with Peking University 
Organisation Peking University
Department School of Electronics Engineering and Computer Science
Country China 
Sector Academic/University 
PI Contribution We are working on a joint project to mine open-source projects on GitHub to detect and repair bugs. We contribute our expertise in code analysis to the project.
Collaborator Contribution The collaborative partner contributes their expertise in natural language processing to the project. The partner team involves two academics and three postgraduate students.
Impact This collaborative work has led to two joint publications: one with DOI 10.18653/v1/P17-1040, and "Scale Up Event Extraction Learning via Automatic Training Data Generation".
Start Year 2017
Description HSA collaboration with AMD 
Organisation Advanced Micro Devices (AMD)
Country United States 
Sector Private 
PI Contribution This work has led to a collaboration with AMD, a main contributor to the Heterogeneous System Architecture (HSA) Foundation. We are currently building a compiler-based HSA auto-tuner for the LLVM HSAIL compiler developed by AMD.
Collaborator Contribution AMD has given us access to their internal version of the HSA driver and provides technical support for their HSA architecture.
Impact This has led to a prototype HSA auto-tuner released on GitHub: https://github.com/zwang4/dividend
Start Year 2016
Title HSA Auto-tuning tool 
Description A compiler-based auto-tuning tool for HSA applications. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact The first auto-tuning tool for HSA programs. 
URL https://github.com/zwang4/dividend
Description Computer Science Podcast 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This year we have continued running CompuCast (compucast.io), a Computer Science podcast, and have produced an episode on the research area of this grant.
Year(s) Of Engagement Activity 2016
URL http://compucast.io
Description NDSS paper 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Our research into Android Pattern Lock security has received wide media coverage. The news appeared in most UK national newspapers and was reported by media outlets around the world to a potential audience of millions (as reported by the press office at Lancaster University).
Year(s) Of Engagement Activity 2016
URL http://www.thetimes.co.uk/edition/news/scientists-finger-security-flaw-on-smartphone-lock-dmql3hdp3