Distributed Heterogeneous Vertically IntegrateD ENergy Efficient Data centres
Lead Research Organisation: University of Edinburgh
Department Name: School of Informatics
Abstract
Our world is in the midst of a "big data" revolution, driven by the ubiquitous ability to gather, analyse, and query datasets of unprecedented variety and size. The sheer storage volume and processing capacity required to manage these datasets have resulted in a transition away from desktop processing and toward warehouse-scale computing inside data centres. State-of-the-art data centres, employed by the likes of Google and Facebook, draw 20-30 MW of power, equivalent to 20,000 homes, and each of these companies needs many such data centres. The global data centre energy footprint is estimated at around 2% of the world's energy consumption and doubles every five years [33, 34]. Contemporary data centres have an average overhead of 90% [32], meaning that they consume up to 1.9 MW to deliver 1 MW of IT support; this is neither cost-effective nor environmentally sound. If the exponential data growth and processing capacity are to scale in the way that both the public and industry have come to rely upon, we must tackle the data centre energy crisis or face the reality of stagnated progress. With the semiconductor industry unable to further lower operating voltages in processor and memory chips, the challenge is to develop technologies for large-scale data-centric computation with energy as a first-order design constraint.
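The overhead figure above can be checked with a short calculation. The sketch below is illustrative only: it assumes that "overhead" means non-IT power expressed as a fraction of IT power, so that power usage effectiveness (PUE) is total facility power divided by IT power. The function names are hypothetical and not part of the project.

```python
def total_facility_power(it_load_mw: float, overhead: float) -> float:
    """Total power drawn when non-IT overhead is a fraction of the IT load."""
    return it_load_mw * (1.0 + overhead)


def power_usage_effectiveness(total_mw: float, it_load_mw: float) -> float:
    """PUE: total facility power divided by the power delivered to IT equipment."""
    if it_load_mw <= 0:
        raise ValueError("IT load must be positive")
    return total_mw / it_load_mw


# Figures quoted in the abstract: 90% overhead on 1 MW of IT load.
total = total_facility_power(1.0, 0.90)
pue = power_usage_effectiveness(total, 1.0)
print(f"total facility power: {total:.2f} MW, PUE: {pue:.2f}")
```

Under this reading, a 90% overhead corresponds to a PUE of about 1.9, which matches the 1.9 MW per 1 MW of IT support quoted in the abstract.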
The DIVIDEND project attacks the data centre energy efficiency bottleneck through vertical integration, specialisation, and cross-layer optimisation. Our vision is to present heterogeneous data centres, combining CPUs, GPUs, and task-specific accelerators, as a unified entity to the application developer and let the runtime optimise the utilisation of the system resources during task execution. DIVIDEND embraces heterogeneity to dramatically lower the energy per task through extensive hardware specialisation while maintaining the ease of programmability of a homogeneous architecture. To lower communication latency and energy, DIVIDEND leverages SoC integration and prefers a lean point-to-point messaging fabric over complex connection-oriented network protocols. DIVIDEND addresses the programmability challenge by adapting and extending the industry-led Heterogeneous System Architecture (HSA) programming language and runtime initiative to account for energy awareness and data movement. DIVIDEND provides a cross-layer energy optimisation framework via a set of APIs for energy accounting and feedback between hardware, compilation, runtime, and application layers. The DIVIDEND project will usher in a new class of vertically integrated data centres and will take a first step towards resolving the energy crisis by improving the power usage effectiveness of data centres by at least 50%.
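As a rough illustration of what energy-aware placement by a runtime could look like, the sketch below picks the device with the lowest estimated energy for a task, optionally subject to a deadline. It is not the DIVIDEND runtime or the HSA API: the `DeviceEstimate` type, the `pick_lowest_energy` function, and all power and runtime figures are hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DeviceEstimate:
    """Hypothetical per-device estimate for running one task."""
    name: str
    avg_power_watts: float
    runtime_seconds: float

    @property
    def energy_joules(self) -> float:
        # Energy is average power multiplied by execution time.
        return self.avg_power_watts * self.runtime_seconds


def pick_lowest_energy(
    estimates: Iterable[DeviceEstimate],
    deadline_seconds: Optional[float] = None,
) -> DeviceEstimate:
    """Return the device with the lowest estimated energy that meets the deadline."""
    candidates = [
        e for e in estimates
        if deadline_seconds is None or e.runtime_seconds <= deadline_seconds
    ]
    if not candidates:
        raise ValueError("no device meets the deadline")
    return min(candidates, key=lambda e: e.energy_joules)


# Made-up figures for illustration only.
estimates = [
    DeviceEstimate("cpu", avg_power_watts=95.0, runtime_seconds=4.0),
    DeviceEstimate("gpu", avg_power_watts=250.0, runtime_seconds=1.0),
    DeviceEstimate("accelerator", avg_power_watts=30.0, runtime_seconds=6.0),
]

print(pick_lowest_energy(estimates).name)
print(pick_lowest_energy(estimates, deadline_seconds=2.0).name)
```

With these invented numbers the slowest device is also the cheapest in energy, so adding a deadline changes the choice. That trade-off between energy and latency is the kind of decision the abstract assigns to the runtime.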
Planned Impact
DIVIDEND aims at a paradigm shift from throughput-oriented to energy-oriented parallel computing. The performance of computing systems is already defined by the available power, yet most layers of the HW/SW stack of modern systems are optimised without power considerations. DIVIDEND will develop methods and tools for energy-aware optimisation throughout the HW and SW stack.
Parallel programming already faces major challenges in dealing with the increasing diversity and heterogeneity of parallel architectures. These challenges will only grow as energy becomes the main limitation in future architectures. DIVIDEND involves the programmer in energy management through the HSA programming language, which assists in energy conservation by parallel programs. At the same time, DIVIDEND implements a high-productivity system software tool-chain which effectively supports heterogeneous data centre environments.
DIVIDEND makes energy a first-class citizen in computing systems by developing a HW/SW environment that manages energy as a resource that is as critical as other resources, such as processor time and memory space. DIVIDEND also advocates a fundamental departure from the current fragmented models of energy management in computing systems. The DIVIDEND HW/SW environment develops holistic energy optimisation by controlling a multitude of tuneable HW and SW parameters and leveraging dynamic workload properties.
All academic partners of DIVIDEND are active and prolific members of the HiPEAC European Network of Excellence. One of the project's principal investigators is also a contributor to the 2013 HiPEAC Roadmap (http://www.hipeac.net/system/files/hipeac_roadmap1_0.pdf). DIVIDEND responds to the three major future challenges identified by the HiPEAC Roadmap: (a) Data Center Computing, "we must develop the capabilities to process 'big data' without increasing cost or energy"; (b) Energy efficiency, "to enable power efficient systems we must address the challenges of programming parallel heterogeneous processors and optimizing data movement"; and (c) System Complexity, "we need to develop tools and techniques to optimize for performance and ensure correct operation, while operating 'at-scale'". DIVIDEND also responds to the societal challenges of environmental protection and productivity: computing systems are a part of the growing energy problem, consuming hundreds of GigaJoules of energy annually (about as much as civil aviation). Computing systems are also an essential ingredient of productivity in all aspects of human economic activity.
Publications
Cummins C
(2017)
Synthesizing benchmarks for predictive modeling
Cummins C
(2016)
Autotuning OpenCL Workgroup Size for Stencil Patterns
Cummins C
(2017)
Towards Collaborative Performance Tuning of Algorithmic Skeletons
Mpeis P
(2016)
Iterative Compilation on Mobile Devices
Mukhanov L
(2017)
ALEA: A Fine-Grained Energy Profiling Tool
in ACM Transactions on Architecture and Code Optimization
Ogilvie W
(2017)
Minimizing the cost of iterative compilation with active learning
Petoumenos P
(2015)
Power Capping: What Works, What Does Not
Rocha R
(2019)
Function Merging by Sequence Alignment
Description | Smart compiler technology can reduce power consumption in mobile and data centre settings |
Exploitation Route | Larger scale investigation in cloud |
Sectors | Digital/Communication/Information Technologies (including Software) |
URL | http://dividend.gforge.inria.fr/ |
Title | CLGen, Deep learning based program generator for OpenCL |
Description | CLGen uses deep learning to generate human-like programs for further machine learning and compiler fuzzing. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | Ongoing. |
URL | https://github.com/ChrisCummins/clgen |