REDONDA: A Next-Generation State-Machine Replication Protocol for Blockchain

Lead Research Organisation: University of Surrey

Department Name: Computing Science

Abstract

The REDONDA project's ambition is to design a next-generation replication protocol for blockchain. To achieve this, the project taps into recent advances in networking, secure computing and distributed systems. At the scale of a datacenter, the protocol relies on two recent technologies: RDMA and TEE. Both technologies are leveraged to create a sub-microsecond consensus layer that tolerates Byzantine failures. TEEs are also used in a novel upgradable and portable smart contract engine to execute blockchain transactions across a variety of infrastructures and hardware. Between data centres, the protocol relies on leaderless state-machine replication. This recent approach decomposes transaction ordering into two sub-tasks that can execute in parallel, without a central coordinator to bottleneck the system. To ensure security and safety at runtime, the REDONDA project creates the blockchain protocol by composing mechanically-verified building blocks. The new blockchain protocol is assessed using real hardware against benchmarks and publicly available traces. We target that it scales across hundreds of geo-distributed nodes while offering 100k+ transactions per second and split-second latency.

Funded Value:

£140,338

Funded Period:

Mar 24 - Apr 27

Funder:

EPSRC

Project Status:

Active

Project Category:

Research Grant

Project Reference:

EP/Y036425/1

Principal Investigator:

Gregory Chockler

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Computer Sys. & Architecture (50%)

Information & Knowledge Mgmt (50%)

Organisations

People	ORCID iD
Gregory Chockler (Principal Investigator)
Brijesh Dongol (Co-Investigator)	http://orcid.org/0000-0003-0446-3507

Publications

Author Name

Title Publication Date Published

10 25 50

Bravo M (2024) Liveness and latency of Byzantine state-machine replication in Distributed Computing

Bravo M (2024) Vertical Atomic Broadcast and Passive Replication

Castañeda A (2024) What Cannot Be Implemented on Weak Memory?

Egorov S. (2024) Mangosteen: Fast Transparent Durability for Linearizable Applications using NVM

Vafeiadi Bila E (2024) A verified durable transactional mutex lock for persistent x86-TSO in Formal Methods in System Design

Further Funding
Collaboration
Intellectual Property
Software and Technical Products


Description	Timing-Based Concurrent Algorithms for Weak Memory Models
Amount	£12,000 (GBP)
Funding ID	IES\R1\221226
Organisation	The Royal Society
Sector	Charity/Non Profit
Country	United Kingdom
Start	07/2022
End	08/2025


Description	Byzantine Fault-Tolerant Computing and Blockchains
Organisation	IMDEA Software
Country	Spain
Sector	Public
PI Contribution	My PhD student Sadegh Keshavarzi and myself have obtained several fundamental results capturing the impact of trusted execution environments (TEE) on failure resilience of distributed algorithms solving several fundamental problems, such as, reliable storage, blockchain consensus, and state-machine replication in the presence of Byzantine faults. Prof Brijesh Dongol has contributed his expertise in formal methods to produce correctness proofs for the algorithms designed by Sadegh and myself.
Collaborator Contribution	IMDEA Software (Prof Alexey Gotsman) are contributing expertise in fault-tolerant distributed algorithms and verification. Royal Holloway (Dr Dan O'Keeffe) are contributing expertise on secure hardware technologies, systems engineering and trusted execution environments.
Impact	2024, 2025 papers co-authored by Alexey Gotsman and Dan O'Keeffe
Start Year	2024


Description	Byzantine Fault-Tolerant Computing and Blockchains
Organisation	Royal Holloway, University of London
Department	Department of Computer Science
Country	United Kingdom
Sector	Academic/University
PI Contribution	My PhD student Sadegh Keshavarzi and myself have obtained several fundamental results capturing the impact of trusted execution environments (TEE) on failure resilience of distributed algorithms solving several fundamental problems, such as, reliable storage, blockchain consensus, and state-machine replication in the presence of Byzantine faults. Prof Brijesh Dongol has contributed his expertise in formal methods to produce correctness proofs for the algorithms designed by Sadegh and myself.
Collaborator Contribution	IMDEA Software (Prof Alexey Gotsman) are contributing expertise in fault-tolerant distributed algorithms and verification. Royal Holloway (Dr Dan O'Keeffe) are contributing expertise on secure hardware technologies, systems engineering and trusted execution environments.
Impact	2024, 2025 papers co-authored by Alexey Gotsman and Dan O'Keeffe
Start Year	2024


Description	Computability in Weak Memory Models
Organisation	National Autonomous University of Mexico
Country	Mexico
Sector	Academic/University
PI Contribution	Brijesh Dongol and myself are contributing expertise in distributed algorithms, shared memory, concurrent data structure and formal methods.
Collaborator Contribution	Tel-Aviv University (Prof Ori Lahav) is contributing expertise on weak memory models and formal frameworks for their modelling. Prof Armando Castañeda (UNAM) is contributing expertise on lower bounds for shared memory and shared memory algorithms.
Impact	Paper published in the proceedings of DISC'24, a premier conference in distributed computing theory.
Start Year	2024


Description	Computability in Weak Memory Models
Organisation	Tel Aviv University
Department	School of Computer Science
Country	Israel
Sector	Academic/University
PI Contribution	Brijesh Dongol and myself are contributing expertise in distributed algorithms, shared memory, concurrent data structure and formal methods.
Collaborator Contribution	Tel-Aviv University (Prof Ori Lahav) is contributing expertise on weak memory models and formal frameworks for their modelling. Prof Armando Castañeda (UNAM) is contributing expertise on lower bounds for shared memory and shared memory algorithms.
Impact	Paper published in the proceedings of DISC'24, a premier conference in distributed computing theory.
Start Year	2024


Description	Learning Caching Strategies for Dynamic Workloads on Graph Databases
Organisation	Neo4j
Country	United States
Sector	Private
PI Contribution	We are investigating strategies to improve throughput of mixed read-write workloads on graph databases by improving page cache performance, taking into account the queries, updates, and underlying graph topology.
Collaborator Contribution	Neo4j (Dr Jim Webber) is contributing an expert knowledge in graph databases and provides inputs on requirements and research questions. Royal Holloway (Dr Dan O'Keeffe) is contributing expertise on relevant operating systems concepts, caching algorithms, and performance evaluation.
Impact	Preliminary problem statement and a PhD research proposal.
Start Year	2024


Description	Learning Caching Strategies for Dynamic Workloads on Graph Databases
Organisation	Royal Holloway, University of London
Department	Department of Computer Science
Country	United Kingdom
Sector	Academic/University
PI Contribution	We are investigating strategies to improve throughput of mixed read-write workloads on graph databases by improving page cache performance, taking into account the queries, updates, and underlying graph topology.
Collaborator Contribution	Neo4j (Dr Jim Webber) is contributing an expert knowledge in graph databases and provides inputs on requirements and research questions. Royal Holloway (Dr Dan O'Keeffe) is contributing expertise on relevant operating systems concepts, caching algorithms, and performance evaluation.
Impact	Preliminary problem statement and a PhD research proposal.
Start Year	2024


Description	Transparent Durability for In-Memory Data Stores
Organisation	Royal Holloway, University of London
Department	Department of Computer Science
Country	United Kingdom
Sector	Academic/University
PI Contribution	My PhD student Sergey Egorov, who's also incoming PDRA on the Redonda project, myself, and Prof Brijesh Dongol are working on designing, implementing and verifying Mangosteen, a novel framework that enables developers to transform a volatile in-memory data store into a durable version on NVM with minimal programming effort.
Collaborator Contribution	Dr Dan O'Keeffe (Royal Holloway) is contributing expertise in systems design and performance evaluation.
Impact	USENIC ATC 24 paper, patent application submitted, software artefact published
Start Year	2024


Title	Mangosteen -- System for Application Durability
Description	The disclosed artifact (Mangosteen) is a high-level programming framework that allows developers to transparently transform an existing in-memory applications to a corresponding durable (recoverable) version using Non-Volatile Memory (NVM), such as Intel Optane. Our framework's Application Programming Interface (API) consists of a set of callback hooks that interpose on an application's request processing flow with minimal developer effort. Mangosteen executes client operations on Dynamic Random Access Memory (DRAM) and persists their effects using binary instrumentation and redo logging. Mangosteen's concurrency control facilitates batching of read-write requests to minimize the cost of persistence, while allowing read-only requests to execute concurrently. A novel intra-batch deduplication mechanism further reduces persistence overheads for common Online Transaction Processing (OLTP) workloads. Our empirical evaluation results show that Mangosteen-enabled applications outperform state-of-the-art solutions across the entire spectrum of read-write ratios. In particular, the Mangosteen-based version of Redis demonstrates throughput gains of between 2x-5x in comparison to prior work.
IP Reference
Protection	Patent / Patent application
Year Protection Granted
Licensed	No


Title	Mangosteen
Description	This is an artefact for Mangosteen, a software layer to transparently transform an existing in-memory data store into a concurrent and persistent one on NVRAM. For more details, please consult the USENIX ATC'24 paper: Sergey Egorov, Gregory Chockler, Brijesh Dongol, Dan O'Keeffe, and Sadegh Keshavarzi, Mangosteen: Fast Transparent Durability for Linearizable Applications using NVM. To compile and run the artefact, follow the instructions in the included README.txt file.
Type Of Technology	Software
Year Produced	2024
Impact	The disclosed artifact (Mangosteen) is a high-level programming framework that allows developers to transparently transform an existing in-memory applications to a corresponding durable (recoverable) version using Non-Volatile Memory (NVM), such as Intel Optane. Our framework's Application Programming Interface (API) consists of a set of callback hooks that interpose on an application's request processing flow with minimal developer effort. Mangosteen executes client operations on Dynamic Random Access Memory (DRAM) and persists their effects using binary instrumentation and redo logging. Mangosteen's concurrency control facilitates batching of read-write requests to minimize the cost of persistence, while allowing read-only requests to execute concurrently. A novel intra-batch deduplication mechanism further reduces persistence overheads for common Online Transaction Processing (OLTP) workloads. Our empirical evaluation results show that Mangosteen-enabled applications outperform state-of-the-art solutions across the entire spectrum of read-write ratios. In particular, the Mangosteen-based version of Redis demonstrates throughput gains of between 2x-5x in comparison to prior work. Commercial applications: We have already shown that Mangosteen can be used to efficiently transform Redis (https://redis.io/), which is an open-source, in-memory data store used by millions of developers and some of the largest tech companies (e.g., Twitter, GitHub, Snapchat, StackOverflow), see https://techstacks.io/tech/redis. Any company that uses Redis can benefit from the data persistence and recoverability offered by Mangosteen. In particular, this would allow these companies to reduce the number of replicas in a distributed deployment, and hence reduce costs.
URL	https://zenodo.org/doi/10.5281/zenodo.11390431

Abstract

Organisations

People

ORCID iD

Publications