Reliable Numerical Computation with Parallel Unreliable Technologies
Lead Research Organisation:
Imperial College London
Department Name: Electrical and Electronic Engineering
Abstract
Imagine a world where mathematical dynamical models of processes in medicine, transport, finance and energy, are routinely constructed and refined, on the fly, by personalised hardware devices. With sufficient computational power, such devices could make intelligent decisions, for example utilising advanced control techniques for real-time intervention in diabetic patients and for pacemakers or defibrilators, or to automatically re-route personalised transport options based on up-to-the-minute information. The vision is a powerful one, but highlights many limitations of today's digital technologies, limitations that will not be overcome with simple scaling of technology, but which need a fundamental rethink of the way in which massively parallel computation can be performed. The focus of this proposal is therefore not on any one of the above grand challenges, but rather on the fundamental scientific problems enabling this revolution. The proposed research could enable computation at orders of magnitude better energy/performance ratio than would be possible by extending today's techniques to tomorrow's technologies, opening up a step change in current practices at both the low power end of computation (sophisticated personal devices) and the high power end (advancements in computational physics, chemistry, biology and finance).Over the next decade, there will be a number of radical shifts to the way in which computation is performed. In particular, rather than computing with a single computational core equipped with a numerical computational unit, or with a traditional inter-processor network of such cores, two differences are becoming increasingly apparent. Firstly, if we allow process technology to scale at the rate it could, then each computational device will become increasingly unreliable. Secondly, under the same assumption, this lack of reliability will be compensated by massive parallelism. Independently, each of these issues requires considerable research effort to overcome the lack of reliability and the current inability to make efficient use of massive parallelism in a portable manner, and there is much ongoing international research in these distinct fast-moving areas.These seemingly distinct challenges, however, have an untapped common research core. Massive parallelism in numerical computation mandates a way to effectively deal with numerical imprecision in computation, even in fully reliable circuitry. One must have in mind a specification of tolerable numerical accuracy. Once such a specification exists, and can be formalised, it opens up the potential to use this specification as input to a powerful specialised optimising compiler, capable of optimising numerical hardware and software in order to achieve the specification with maximum performance within a given power envelope. Lack of reliability in future devices will propagate to numerical computation as noticeable numerical errors in the hardware. One way of overcoming these errors is to utilise parallelism in some kind of redundant fashion. A radical alternative is to `hide' the numerical errors caused by unreliable components within the tolerance specification already required. For a fixed silicon area, the emerging multi-core revolution in computational hardware brings to the fore a tension between numerical precision and computational performance; one can no longer afford to pay the price in performance for over-designed hardware.At the core of this research will therefore be the development of mathematical techniques to reason about the compilation of numerical software into parallel hardware, broadening applicability to general classes of nonlinear and control-intensive algorithms, requiring the application of nonlinear systems theory to reason about the convergence of classes of algorithm under numerical perturbation, and mechanising algebraic approaches to reasoning about accuracy in numerical algorithms.
Planned Impact
A full discussion of impact is included in the Pathways to Impact statement. 1. Knowledge - Fundamental scientific advances in analysis and optimization of numerical algorithms for massively parallel unreliable hardware. - Implications for the program analysis and numerical analysis communities in terms of new multi-disciplinary research programmes. - New tools and techniques released via the web. 2. Economy - Contributions to UK competitiveness in the area of parallel hardware architectures and tools. - Contributions to the knowledge-base of UK companies such as ARM and to other investors in the UK. - Potential for patents, licensing, and possibly spin-outs via Imperial Innovations. 3. People - Development of highly skilled researchers in shortage areas. 4. Society - Potential impact in personalised devices, healthcare, transport, modelling and simulation of pandemics, financial markets, and others.
Publications
Le Lann C
(2011)
Reconfigurable Computing: Architectures, Tools and Applications
Constantinides G
(2011)
Guest Editors' Introduction: Surveying the Landscape of FPGA Accelerator Research
in IEEE Design & Test of Computers
Bayliss S
(2011)
Reconfigurable Computing: Architectures, Tools and Applications
Constantinides G
(2011)
Numerical Data Representations for FPGA-Based Scientific Computing
in IEEE Design & Test of Computers
Manoukian M
(2011)
Reconfigurable Computing: Architectures, Tools and Applications
Boland D
(2011)
Bounding Variable Values and Round-Off Effects Using Handelman Representations
in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Boland D
(2011)
Optimizing memory bandwidth use and performance for matrix-vector multiplication in iterative methods
in ACM Transactions on Reconfigurable Technology and Systems
Longo S
(2011)
A parallel formulation for predictive control with nonuniform hold constraints
in Annual Reviews in Control
Liu Q
(2012)
Optimizing Hardware Design by Composing Utility-Directed Transformations
in IEEE Transactions on Computers
Rafique A
(2012)
Reconfigurable Computing: Architectures, Tools and Applications
Drane T
(2012)
Correctly rounded constant integer division via multiply-add
Rafique A
(2012)
Enhancing performance of Tall-Skinny QR factorization using FPGAs
Bayliss S
(2012)
Analytical synthesis of bandwidth-efficient SDRAM address generators
in Microprocessors and Microsystems
Bayliss S
(2012)
Optimizing SDRAM bandwidth for custom FPGA loop accelerators
Shahzad A
(2012)
A Stable and Efficient Method for Solving a Convex Quadratic Program with Application to Optimal Control
in SIAM Journal on Optimization
Boland D
(2012)
A scalable approach for automated precision analysis
Hasan A
(2013)
Control-Theoretic Forward Error Analysis of Iterative Numerical Algorithms
in IEEE Transactions on Automatic Control
Kan Shi
(2013)
Overclocking datapath for latency-error tradeoff
Boland D
(2013)
A Scalable Precision Analysis Framework
in IEEE Transactions on Multimedia
Guan Z
(2014)
Mitigation of process variation effect in FPGAs with partial rerouting method
in IEICE Electronics Express
Guan Z
(2014)
Classification on variation maps: a new placement strategy to alleviate process variation on FPGA
in IEICE Electronics Express
Winterstein F
(2015)
MATCHUP
Winterstein F
(2015)
Separation Logic for High-Level Synthesis
in ACM Transactions on Reconfigurable Technology and Systems
Wickerson J
(2015)
Remote-scope promotion: clarified, rectified, and verified
Jerez J
(2015)
A Low Complexity Scaling Method for the Lanczos Kernel in Fixed-Point Arithmetic
in IEEE Transactions on Computers
Hung E
(2015)
Delay-Bounded Routing for Shadow Registers
Shi K
(2015)
Imprecise Datapath Design An Overclocking Approach
in ACM Transactions on Reconfigurable Technology and Systems
Kerrigan E
(2015)
Computer architectures to close the loop in real-time optimization
Rafique A
(2015)
Communication Optimization of Iterative Sparse Matrix-Vector Multiply on GPUs and FPGAs
in IEEE Transactions on Parallel and Distributed Systems
Ramanathan N
(2016)
A Case for Work-stealing on FPGAs with OpenCL Atomics
Batty M
(2016)
Overhauling SC atomics in C11 and OpenCL
in ACM SIGPLAN Notices
Davis J
(2016)
Knowledge is Power
Batty M
(2016)
Overhauling SC atomics in C11 and OpenCL
Khusainov B
(2016)
Multi-objective Co-design for model predictive control with an FPGA
Constantinides G
(2017)
Algorithms and Arithmetic: Choose Wisely
Wickerson J
(2017)
Automatically comparing memory consistency models
in ACM SIGPLAN Notices
Description | - That iterative algorithms are amenable to automated finite precision analysis - That memory systems both containing regular and irregular accesses can be customised automatically to the underlying application - That overclocking error can be bounded an treated in a similar fashion to roundoff error - That novel data representations can improve the ability to overclock datapath |
Exploitation Route | By applying the lessons in computer architecture design By developing tools to fully automate the approaches developed |
Sectors | Digital/Communication/Information Technologies (including Software),Electronics |
Description | Research has been used by a number of companies. Imagination Technologies funded funded an RAEng Research Chair to take this forward. Intel is currently funding research at Imperial College continuing these theme of work. |
Sector | Creative Economy,Digital/Communication/Information Technologies (including Software),Electronics,Leisure Activities, including Sports, Recreation and Tourism |
Impact Types | Societal,Economic |
Description | FP7 STREP - DESYRE |
Amount | £170,209 (GBP) |
Funding ID | DESYRE |
Organisation | European Commission |
Sector | Public |
Country | European Union (EU) |
Start | 10/2011 |
End | 01/2015 |
Description | RAEng Research Chair |
Amount | £215,536 (GBP) |
Organisation | Royal Academy of Engineering |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 01/2015 |
End | 12/2019 |
Description | RAEng Research Chair (Industry) |
Amount | £342,958 (GBP) |
Organisation | Imagination Technologies |
Sector | Private |
Country | United Kingdom |
Start | 01/2015 |
End | 12/2019 |
Description | ARM - ChallEng |
Organisation | Arm Limited |
Country | United Kingdom |
Sector | Private |
PI Contribution | None to date |
Collaborator Contribution | None to date |
Impact | None to date |
Start Year | 2011 |
Description | Altera - ChallEng |
Organisation | Altera |
Department | Altera Europe |
Country | United Kingdom |
Sector | Private |
PI Contribution | Research output |
Collaborator Contribution | Design tools, licences, engineer time |
Impact | Publications |
Start Year | 2011 |
Description | Microsoft - ChallEng |
Organisation | Microsoft Research |
Department | Microsoft Research Cambridge |
Country | United Kingdom |
Sector | Private |
PI Contribution | Research output |
Collaborator Contribution | Terminator |
Impact | Publications |
Start Year | 2011 |
Description | QMUL - ChallEng |
Organisation | Queen Mary University of London |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Research output |
Collaborator Contribution | Workshops |
Impact | QMUL group moved |
Start Year | 2011 |
Description | University of Toronto - Chall Eng |
Organisation | University of Toronto |
Country | Canada |
Sector | Academic/University |
PI Contribution | Staff exchange |
Collaborator Contribution | Staff exchange |
Impact | Staff exchange |
Start Year | 2011 |