Automated holistic efficiency for next generation data centres

Lead Research Organisation: Edgetic Ltd
Department Name: Research

Abstract

Data centres (DCs) provide critical infrastructure underpinning modern economies; they hold the software and data that modern life depends on. These facilities use massive amounts of power, from a few kilowatts up to hundreds of megawatts, much of it generated in traditional carbon-producing power stations. Although there is an increasing trend towards green energy supply, the need to improve DC efficiency as a whole is critical to UK industry. To date, most of the effort has focussed on physical systems. This project will radically improve current approaches by greatly expanding on the methodologies used to model and simulate servers. We will introduce a step-change in DC efficiency by creating a holistic framework that accounts for software, hardware, facility, and human behaviours, and using it to train advanced intelligent agents to achieve substantial energy reductions without affecting performance.

The number and size of DCs are growing rapidly, as they are the backbone of emerging technologies such as IoT, 5G and AI. Current DC energy usage is approximately 3% of global consumption, but could reach as high as 10% by 2030. To remain competitive in this growing sector it is vital that UK DCs keep their energy usage in check to compete with regions where green power is cheap. The push for DC efficiency is driven by several factors: reducing carbon emissions, reducing costs and increasing capability where power is restricted. Edgetic is an early-stage technology company aiming to improve DC efficiency via software services.

The standard measure of DC efficiency is PUE (Power Usage Effectiveness): the ratio of the power consumed by the whole facility to that consumed by the IT equipment. A PUE of 1 is a theoretical minimum implying energy is only used by the IT hardware; efficiency worsens as PUE increases. Focusing on PUE, the industry has prioritised improving isolated peripheral systems rather than reducing overall energy consumption. PUE improvements are slowing as peripheral, co-dependent systems reach the limits of individual optimisation; improving IT efficiency is the next research frontier. Edgetic uses predictive mathematical models of IT behaviour to make optimising decisions for the DC. However, our current approach requires individually modelling each workload and type of server in a DC. At present this is acceptable, but in order to substantially grow the business it is vital to improve the scalability of the modelling process, since every DC is unique. Every additional variation in hardware and workload substantially increases the required evaluation.
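
The PUE ratio defined above, and the reason it can mislead when used on its own, can be illustrated with a minimal sketch. The function name `pue` and all power figures below are hypothetical and chosen only for illustration; they are not measurements from this project.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Return PUE = total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    if total_facility_kw < it_equipment_kw:
        raise ValueError("facility power cannot be less than IT power")
    return total_facility_kw / it_equipment_kw


# Hypothetical baseline: 1000 kW of IT load plus 500 kW of overhead
# (cooling, power distribution, lighting).
overhead_kw = 500.0
baseline_it_kw = 1000.0
baseline_pue = pue(baseline_it_kw + overhead_kw, baseline_it_kw)

# Hypothetical IT-side saving: the same work is done with 800 kW of IT load,
# while the overhead is assumed (for simplicity) to stay unchanged.
improved_it_kw = 800.0
improved_pue = pue(improved_it_kw + overhead_kw, improved_it_kw)

print(f"baseline: total={baseline_it_kw + overhead_kw:.0f} kW, PUE={baseline_pue:.3f}")
print(f"improved: total={improved_it_kw + overhead_kw:.0f} kW, PUE={improved_pue:.3f}")
```

In this illustrative case total consumption falls from 1500 kW to 1300 kW, yet PUE rises from 1.5 to 1.625, which is why a metric that rewards only lower overhead says little about IT efficiency.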

The aim of this project is to develop novel methods to speed up server evaluation, estimate the behaviour of new hardware combinations and predict performance for different workloads. Uniquely, these methods will both be employed in the existing optimisation technology and provide the foundation for new artificial intelligence tools that optimise DC operation using holistic behaviour simulations. The holistic approach will allow automatic DC optimisation using new operating strategies tailored to individual DCs based on their required characteristics. This has the benefit of radically improving data centre efficiency, which in turn reduces the climate impact of DCs and maintains the UK's leading position in the data centre industry.

Planned Impact

The direct outputs from the project will provide unique IP that will give Edgetic new opportunities to expand its service and develop new products. In particular, the holistic simulation tools can be developed into capacity planning or design tools for data centres. The simulation also provides an opportunity to assess the theoretical optimum configuration of hardware and software in the data centre and compare it against actual measurements to gauge efficiency. The wind-tunnel and server profiling process will position Edgetic as a leader in server efficiency measurement and allow it to offer services that measure performance or recommend hardware based on expected workloads. Additionally, the data and insights gained during the project are expected to lead to new advances in applying AI to data centres, a widely acknowledged need that has yet to be met in the industry.

The primary benefit of this project for data centre operators is the potential for reduced operating costs and delayed capital expenditure. By increasing data centre efficiency, power costs are reduced and existing hardware is better utilised, resulting in a longer upgrade cycle than in less efficient data centres. For system integrators, the project offers the opportunity to improve the efficiency of their servers, potentially by introducing changes to server design or adjusting default settings for more efficient operation.

For policy makers and the wider public, this project's approach offers a much better way of determining the efficiency of data centres and therefore of reducing their negative impacts. Using the methods proposed in the project, policy makers gain new ways of measuring the efficiency and impact of a data centre, whether that data centre is already operational or is being planned. This provides them with new tools to legislate effectively against the adverse effects data centres can have on carbon emissions or the local grid. This has the obvious benefit for the wider public of reducing data centres' effect on the environment and reducing public expenditure on the infrastructure required to cope with data centre demand, while not adversely affecting data centre performance. For the wider public, increased data centre efficiency also means that data centres can have a greater processing capacity per watt, reducing the cost of processing. This in turn supports the UK's growing digital economy by making the UK data centre market more competitive. It also facilitates the expansion of new smart city technologies like 5G and the IoT, providing greater benefits to the wider UK population.

Publications

Description: We began research into understanding how IT equipment is affected by data centre temperature and did some initial research into how a digital twin of a data centre could be constructed to include all aspects of the data centre.
Exploitation Route: The initial findings could help in the design and development of future data centres as well as in optimising those that are already operating. Additionally, I have built upon this initial work after transferring to another host.
Sectors: Digital/Communication/Information Technologies (including Software)