Implementing new BFT Protocols
This document describes the steps to implement a new BFT protocol in ByzzBench.
A BFT Protocol implementation in ByzzBench consists of the following components:
- A protocol replica that extends
Replica<T>: these are the nodes that run the protocol and communicate with each other via messages.Tis the type of each of the entries in the commit log (this should be simplified!).- Example: PBFT-Java Replica
- The
Replicaconstructor requires areplicaId,nodeIds(set of all replica IDs in the cluster),transport(the virtualized network instance) and acommitLog(instance where all committed operations from the replica are sent to).
- A set of protocol message POJOs that implement
MessagePayload: these are the messages that are exchanged between replicas.- They just need to have a
String getType()method that returns e.g. that it is aPREPAREmessage. - Examples from PBFT-Java: Checkpoint, Commit, NewView, Phase, Prepare, PrePrepare, Reply, Request, ViewChange
- They just need to have a
Communication and Timeouts
Using the Actor model, replicas communicate with each other via a Transport instance. The Transport is a virtualized network that allows replicas to send messages to each other. The Transport is responsible for delivering messages to the correct recipient, and also for handling timeouts.
Communication between replicas is done via message-passing through the Transport. The Replica superclass exposes a few methods:
sendMessage: send a message to a recipientmulticastMessage: send a message to a set of replicasbroadcastMessage: send a message to every other replicabroadcastMessageIncludingSelf: send a message to every replica, including self
Timeouts are implemented via callbacks to a Runnable, also handled by the Transport:
setTimeout(Runnable, timeout): Creates a timeout Action oftimeoutmilliseconds, which will then invoke theRunnable. The time is currently being ignored, and is instead triggered just like any other Action by the explorationStrategy.clearAllTimeouts(): Invalidates all outstanding timeouts for the current replica.
Commit Log
Each replica has its own instance of a CommitLog: an immutable and total ordered sequence of records. This is used to check whether safety invariants of distributed consensus are broken.
Pitfalls
- Non-determinism: For reproducibility we require the removal of any non-determinism! This can manifest itself in many ways:
- Avoid interfaces and collections that do not guarantee order, such as
SetorMap: use theOrderedSetorOrderedMapinterfaces instead, andTreeSetorLinkedHashMapimplementations to ensure order. - Avoid the use of Java’s
CompletableFuture: the implementation in Java is non-deterministic, as it uses aForkJoinPoolthat can execute tasks in parallel - tasks that are submitted to theCompletableFuturecan be executed in any order, and this can lead to non-deterministic behavior. Use our ownDeterministicCompletableFuture, which will execute the tasks in the order they were submitted to it.
- Avoid interfaces and collections that do not guarantee order, such as
Components
The diagram below includes the relevant ByzzBench components for implementing a new BFT protocol.
---
title: Simulator Components
---
classDiagram
class Event {
-int eventId
}
class Transport {
}
class MessageEvent {
-String senderId
-String recipientId
-MessagePayload message
-MessageStatus status
}
class TimeoutEvent {
}
class MessageStatus {
<<Enumeration>>
QUEUED
DELIVERED
DROPPED
}
class MessagePayload {
<<Interface>>
+String getType()
}
class Replica {
<<Abstract>>
+String getType()
}
class CommitLog {
<<Abstract>>
addEntry()
}
class TotalOrderCommitLog {
addEntry()
}
class PartialOrderCommitLog {
addEntry()
}
Event <|-- MessageEvent
Event <|-- TimeoutEvent
MessageEvent -- MessageStatus
MessageEvent -- MessagePayload
CommitLog -- Replica
Transport o-- Event
Transport o-- Replica
Replica --> Event: emits, receives
CommitLog <|-- TotalOrderCommitLog
CommitLog <|-- PartialOrderCommitLog