DART: Design Accelerators by Regulating Transformations
Lead Research Organisation: Imperial College London
Department Name: Computing
Abstract
The DART project aims to pioneer a ground-breaking capability to enhance the performance and energy efficiency of reconfigurable hardware accelerators for next-generation computing systems. This capability will be achieved through a novel transformation engine, based on heterogeneous graphs, for design optimisation and diagnosis. While hardware designers are familiar with transformations based on Boolean algebra, the proposed research promotes a design-by-transformation style by providing, for the first time, tools that facilitate experimentation with design transformations and their regulation by meta-programming. These tools will cover design space exploration based on machine learning, and end-to-end tool chains mapping designs captured in multiple source languages onto heterogeneous reconfigurable devices targeting cloud computing, the Internet of Things and supercomputing. The proposed approach will be evaluated through a variety of benchmarks involving hardware acceleration, and through codifying strategies for automating the search for neural architectures whose hardware implementations offer both high accuracy and high efficiency.
Organisations
- Imperial College London (Lead Research Organisation)
- Tianjin University (Project Partner)
- Deloitte UK (Project Partner)
- Corerain Technologies (Project Partner)
- Intel Corporation (UK) Ltd (Project Partner)
- Xilinx Corp (Project Partner)
- Stanford University (Project Partner)
- Microsoft Research Limited (Project Partner)
- RIKEN (Project Partner)
- Maxeler Technologies Ltd (Project Partner)
- University of British Columbia (Project Partner)
- Cornell University (Project Partner)
- Dunnhumby (Project Partner)
People
- Wayne Luk (Principal Investigator)
Publications
- Denholm S (2023) Customisable Processing of Neural Networks for FPGAs
- Fan H (2023) High-Performance Acceleration of 2-D and 3-D CNNs on FPGAs Using Static Block Floating Point. In IEEE Transactions on Neural Networks and Learning Systems
- Fan H (2022) FPGA-Based Acceleration for Bayesian Convolutional Neural Networks. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
- Fan H (2022) Enabling Fast Uncertainty Estimation
- Fan H (2022) Accelerating Bayesian Neural Networks via Algorithmic and Hardware Optimizations. In IEEE Transactions on Parallel and Distributed Systems
| Description | DART's first objective, to deliver a novel transformation engine based on heterogeneous graphs for efficient hardware design, is met by three new design optimisation capabilities. First, the Meta-Programming Design-Flow Pattern approach [1] codifies reusable, application-agnostic optimisation tasks into automated design flows, significantly enhancing developer productivity and hardware efficiency on CPU (Central Processing Unit) and GPU (Graphics Processing Unit) platforms. Second, the Auto-Generating Diverse Heterogeneous Designs approach [2] extends this automation through Path Selection Automation, enabling selection of optimised hardware targets across CPUs, GPUs and FPGAs (Field Programmable Gate Arrays) based on static and runtime analyses. Third, the MetaML approach [3] provides customisable, cross-stage optimisation strategies for deep learning accelerators, automating transformation selection from high-level neural network descriptions down to low-level FPGA implementations. These three approaches show that heterogeneous graphs provide a unified representation for diverse model types, ranging from Abstract Syntax Trees capturing high-level specifications to dataflow representations of neural networks. Such a representation enables a consistent and flexible transformation-engine framework for optimisations targeting diverse, heterogeneous computing platforms.

DART's second objective, to capture transformations for design optimisation and diagnosis across multiple source-level languages and heterogeneous resources, is demonstrated by automating diverse heterogeneous designs [2]. This work showcases the generation of optimised implementations targeting CPU, GPU and FPGA hardware from a single, high-level application description. The automated workflow supports multiple source-level languages, such as C/C++ with extensions including OpenMP, AMD HIP and Intel oneAPI, effectively abstracting target-specific transformations and simplifying design portability.
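The sketch below illustrates the design-by-transformation idea described above in a deliberately simplified form. It is not the DART toolchain: the names (Node, DesignGraph, design_flow_pattern, unroll_small_loops, quantise_dataflow) and the regulation policy are illustrative assumptions. It shows how a design held as a heterogeneous graph, whose nodes may be AST fragments or dataflow operators, can be rewritten by reusable, application-agnostic transformation passes whose application is regulated per hardware target by a meta-program.

```python
# Minimal sketch of a design-flow pattern over a heterogeneous design graph.
# All names and the regulation policy are hypothetical, for illustration only.
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Node:
    kind: str                                   # e.g. "ast" or "dataflow"
    op: str                                     # e.g. "loop", "conv2d"
    attrs: Dict[str, object] = field(default_factory=dict)

@dataclass
class DesignGraph:
    nodes: List[Node] = field(default_factory=list)

Transformation = Callable[[DesignGraph], None]

def unroll_small_loops(g: DesignGraph) -> None:
    # Mark small AST-level loops for full unrolling.
    for n in g.nodes:
        if n.kind == "ast" and n.op == "loop" and n.attrs.get("trip_count", 0) <= 8:
            n.attrs["unroll"] = "full"

def quantise_dataflow(g: DesignGraph) -> None:
    # Switch dataflow operators to 8-bit arithmetic.
    for n in g.nodes:
        if n.kind == "dataflow":
            n.attrs["precision"] = "int8"

def design_flow_pattern(g: DesignGraph, passes: List[Transformation],
                        target: str) -> DesignGraph:
    # The reusable pattern: the passes stay fixed, only the regulation
    # policy decides which of them run for a given hardware target
    # (in this toy policy, only FPGA targets are quantised).
    for t in passes:
        if target != "fpga" and t is quantise_dataflow:
            continue
        t(g)
    return g

# One high-level description, three hardware targets.
design = DesignGraph([Node("ast", "loop", {"trip_count": 4}),
                      Node("dataflow", "conv2d")])
for target in ("cpu", "gpu", "fpga"):
    optimised = design_flow_pattern(copy.deepcopy(design),
                                    [unroll_small_loops, quantise_dataflow], target)
    print(target, [(n.op, n.attrs) for n in optimised.nodes])
```

The point of the pattern is that the transformations themselves remain application-agnostic; only the regulation meta-program changes when the same design description is retargeted to a different platform.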
DART's third objective, to investigate techniques for regulating transformations for design space exploration, is met by integrating advanced machine learning techniques and regulation strategies into deep learning design flows. Specifically, we introduce Bayesian optimisation techniques that enhance the efficiency of design exploration for customisable processor architectures [4], effectively reducing design space traversal effort without compromising solution quality (a minimal sketch of such an optimisation loop is given after the references below).

DART's fourth objective, to prototype an end-to-end tool chain for heterogeneous hardware targeting cloud and edge systems, is demonstrated by two studies. First, a flexible and automated end-to-end toolchain generates diverse designs for CPU, GPU and FPGA hardware from a single high-level source [2]; the related cost and performance trade-offs are illustrated for heterogeneous cloud resources. Second, hardware-aware optimisations for deep learning inference on edge devices [5] automate the configuration of FPGA hardware resources, balancing latency, resource utilisation and throughput for efficient deployment of AI solutions on reconfigurable IoT devices. These optimisations are integrated in our MetaML toolchain [3]. DART has also led to new neural architecture search techniques and to new applications in medical imaging and high-energy physics, which will be reported in the future.

[1] J. Vandebon, J.G.F. Coutinho, W. Luk: Meta-Programming Design-Flow Patterns for Automating Reusable Optimisations. HEART 2022
[2] J. Vandebon, J.G.F. Coutinho, W. Luk: Auto-Generating Diverse Heterogeneous Designs. RAW 2024
[3] Z. Que et al.: MetaML: Automating Customizable Cross-Stage Design-Flow for Deep Learning Acceleration. FPL 2023
[4] J.G.F. Coutinho et al.: Exploring Machine Learning Adoption in Customisable Processor Design. ASICON 2023
[5] M. Rognlien et al.: Hardware-Aware Optimizations for Deep Learning Inference on Edge Devices. ARC 2022 |
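The following is a minimal sketch of the Bayesian-optimisation loop referred to under the third objective, assuming a purely synthetic cost model: latency_model, the unroll/units/bitwidth design space and the acquisition weighting are illustrative assumptions, not measurements or methods taken from [4]. It shows how a Gaussian-process surrogate with a lower-confidence-bound acquisition function lets only a fraction of the candidate designs be evaluated while still locating a good configuration.

```python
# Minimal Bayesian-optimisation sketch for accelerator design space exploration.
# The cost model and parameter names are hypothetical placeholders.
import itertools
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Candidate designs: (loop unroll factor, parallel compute units, bitwidth).
space = np.array(list(itertools.product([1, 2, 4, 8, 16],       # unroll
                                        [1, 2, 4, 8],           # units
                                        [8, 16, 32])), float)   # bitwidth

def latency_model(x):
    # Stand-in for an expensive synthesis/simulation run on one configuration.
    unroll, units, bits = x
    return 1000.0 / (unroll * units) + 0.5 * bits + 2.0 * units  # toy cost

rng = np.random.default_rng(0)
evaluated = list(rng.choice(len(space), size=3, replace=False))   # warm start
costs = [latency_model(space[i]) for i in evaluated]

for _ in range(10):                       # 10 expensive evaluations instead of 60
    gp = GaussianProcessRegressor(normalize_y=True).fit(space[evaluated], costs)
    mean, std = gp.predict(space, return_std=True)
    acq = mean - 1.5 * std                # lower confidence bound acquisition
    acq[evaluated] = np.inf               # never re-evaluate a design
    nxt = int(np.argmin(acq))
    evaluated.append(nxt)
    costs.append(latency_model(space[nxt]))

best = evaluated[int(np.argmin(costs))]
print("best design (unroll, units, bits):", space[best], "cost:", min(costs))
```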
| Exploitation Route | The outcomes are being adopted in follow-on projects and research collaborations. First, DART can support the SONNETS programme (EP/X036006/1), which explores novel capabilities for national-level real-time financial risk analysis, involving project partners from financial institutions. Second, DART can benefit tools for the Silicon Brain Cube project (UKRI256), which investigates 3D hardware architectures and advanced AI techniques for next-generation electronic systems. Third, DART can contribute to our collaboration with the Institute of Cancer Research and CERN, which requires heterogeneous hardware resources for speeding up demanding applications in areas such as medical image processing and machine learning for high-energy physics experiments. |
| Sectors | Electronics; Financial Services and Management Consultancy; Healthcare; Other |
| URL | https://dart.doc.ic.ac.uk/ |
| Description | Reliable and Robust Quantum Computing |
| Amount | £2,227,394 (GBP) |
| Funding ID | EP/W032635/1 |
| Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
| Sector | Public |
| Country | United Kingdom |
| Start | 03/2022 |
| End | 03/2026 |
| Description | SONNETS: Scalability Oriented Novel Network of Event Triggered Systems |
| Amount | £6,467,613 (GBP) |
| Funding ID | EP/X036006/1 |
| Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
| Sector | Public |
| Country | United Kingdom |
| Start | 01/2024 |
| End | 12/2028 |