Reliable Numerical Computation with Parallel Unreliable Technologies

Lead Research Organisation: Imperial College London
Department Name: Electrical and Electronic Engineering

Abstract

Imagine a world where mathematical dynamical models of processes in medicine, transport, finance and energy, are routinely constructed and refined, on the fly, by personalised hardware devices. With sufficient computational power, such devices could make intelligent decisions, for example utilising advanced control techniques for real-time intervention in diabetic patients and for pacemakers or defibrilators, or to automatically re-route personalised transport options based on up-to-the-minute information. The vision is a powerful one, but highlights many limitations of today's digital technologies, limitations that will not be overcome with simple scaling of technology, but which need a fundamental rethink of the way in which massively parallel computation can be performed. The focus of this proposal is therefore not on any one of the above grand challenges, but rather on the fundamental scientific problems enabling this revolution. The proposed research could enable computation at orders of magnitude better energy/performance ratio than would be possible by extending today's techniques to tomorrow's technologies, opening up a step change in current practices at both the low power end of computation (sophisticated personal devices) and the high power end (advancements in computational physics, chemistry, biology and finance).Over the next decade, there will be a number of radical shifts to the way in which computation is performed. In particular, rather than computing with a single computational core equipped with a numerical computational unit, or with a traditional inter-processor network of such cores, two differences are becoming increasingly apparent. Firstly, if we allow process technology to scale at the rate it could, then each computational device will become increasingly unreliable. Secondly, under the same assumption, this lack of reliability will be compensated by massive parallelism. Independently, each of these issues requires considerable research effort to overcome the lack of reliability and the current inability to make efficient use of massive parallelism in a portable manner, and there is much ongoing international research in these distinct fast-moving areas.These seemingly distinct challenges, however, have an untapped common research core. Massive parallelism in numerical computation mandates a way to effectively deal with numerical imprecision in computation, even in fully reliable circuitry. One must have in mind a specification of tolerable numerical accuracy. Once such a specification exists, and can be formalised, it opens up the potential to use this specification as input to a powerful specialised optimising compiler, capable of optimising numerical hardware and software in order to achieve the specification with maximum performance within a given power envelope. Lack of reliability in future devices will propagate to numerical computation as noticeable numerical errors in the hardware. One way of overcoming these errors is to utilise parallelism in some kind of redundant fashion. A radical alternative is to `hide' the numerical errors caused by unreliable components within the tolerance specification already required. For a fixed silicon area, the emerging multi-core revolution in computational hardware brings to the fore a tension between numerical precision and computational performance; one can no longer afford to pay the price in performance for over-designed hardware.At the core of this research will therefore be the development of mathematical techniques to reason about the compilation of numerical software into parallel hardware, broadening applicability to general classes of nonlinear and control-intensive algorithms, requiring the application of nonlinear systems theory to reason about the convergence of classes of algorithm under numerical perturbation, and mechanising algebraic approaches to reasoning about accuracy in numerical algorithms.

Planned Impact

A full discussion of impact is included in the Pathways to Impact statement. 1. Knowledge - Fundamental scientific advances in analysis and optimization of numerical algorithms for massively parallel unreliable hardware. - Implications for the program analysis and numerical analysis communities in terms of new multi-disciplinary research programmes. - New tools and techniques released via the web. 2. Economy - Contributions to UK competitiveness in the area of parallel hardware architectures and tools. - Contributions to the knowledge-base of UK companies such as ARM and to other investors in the UK. - Potential for patents, licensing, and possibly spin-outs via Imperial Innovations. 3. People - Development of highly skilled researchers in shortage areas. 4. Society - Potential impact in personalised devices, healthcare, transport, modelling and simulation of pandemics, financial markets, and others.

Publications

10 25 50

publication icon
Batty M (2016) Overhauling SC atomics in C11 and OpenCL in ACM SIGPLAN Notices

publication icon
Bayliss S (2012) Analytical synthesis of bandwidth-efficient SDRAM address generators in Microprocessors and Microsystems

publication icon
Boland D (2013) A Scalable Precision Analysis Framework in IEEE Transactions on Multimedia

publication icon
Boland D (2011) Bounding Variable Values and Round-Off Effects Using Handelman Representations in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

publication icon
Boland D (2011) Optimizing memory bandwidth use and performance for matrix-vector multiplication in iterative methods in ACM Transactions on Reconfigurable Technology and Systems

 
Description - That iterative algorithms are amenable to automated finite precision analysis

- That memory systems both containing regular and irregular accesses can be customised automatically to the underlying application

- That overclocking error can be bounded an treated in a similar fashion to roundoff error

- That novel data representations can improve the ability to overclock datapath
Exploitation Route By applying the lessons in computer architecture design

By developing tools to fully automate the approaches developed
Sectors Digital/Communication/Information Technologies (including Software),Electronics

 
Description Research has been used by a number of companies. Imagination Technologies funded funded an RAEng Research Chair to take this forward. Intel is currently funding research at Imperial College continuing these theme of work.
Sector Creative Economy,Digital/Communication/Information Technologies (including Software),Electronics,Leisure Activities, including Sports, Recreation and Tourism
Impact Types Societal,Economic

 
Description FP7 STREP - DESYRE
Amount £170,209 (GBP)
Funding ID DESYRE 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 10/2011 
End 01/2015
 
Description RAEng Research Chair
Amount £215,536 (GBP)
Organisation Royal Academy of Engineering 
Sector Charity/Non Profit
Country United Kingdom
Start 01/2015 
End 12/2019
 
Description RAEng Research Chair (Industry)
Amount £342,958 (GBP)
Organisation Imagination Technologies 
Sector Private
Country United Kingdom
Start 01/2015 
End 12/2019
 
Description ARM - ChallEng 
Organisation Arm Limited
Country United Kingdom 
Sector Private 
PI Contribution None to date
Collaborator Contribution None to date
Impact None to date
Start Year 2011
 
Description Altera - ChallEng 
Organisation Altera
Department Altera Europe
Country United Kingdom 
Sector Private 
PI Contribution Research output
Collaborator Contribution Design tools, licences, engineer time
Impact Publications
Start Year 2011
 
Description Microsoft - ChallEng 
Organisation Microsoft Research
Department Microsoft Research Cambridge
Country United Kingdom 
Sector Private 
PI Contribution Research output
Collaborator Contribution Terminator
Impact Publications
Start Year 2011
 
Description QMUL - ChallEng 
Organisation Queen Mary University of London
Country United Kingdom 
Sector Academic/University 
PI Contribution Research output
Collaborator Contribution Workshops
Impact QMUL group moved
Start Year 2011
 
Description University of Toronto - Chall Eng 
Organisation University of Toronto
Country Canada 
Sector Academic/University 
PI Contribution Staff exchange
Collaborator Contribution Staff exchange
Impact Staff exchange
Start Year 2011