Kyle Butt & Nash E. Foster
Pyrofex’s research team recently published a paper describing a new leaderless consensus algorithm named Casanova. The focus of that paper was to write a journal-quality description of the algorithm and prove it is safe, live, and correct. Academic papers are not for everyone, however, and so we wanted to write up a more colloquial description of Casanova for the blockchain engineering community.
We’d like to begin by giving credit and heartfelt thanks to the Pyrofex research team, which includes: Dr. Michael Stay, CTO; Kyle Butt, Staff Software Engineer; and Derek Sorensen, Research Mathematician. It’s always a privilege to work with talented people, to help create the conditions under which they can do their best work, and to see them achieve great things. Everyone at Pyrofex is proud to be associated with such a prestigious and talented group.
Over the course of 2017 and 2018, Pyrofex worked on a number of blockchain platform projects, including the RChain project and some others we can’t tell you about. Most of these projects require development of a consensus algorithm. After we finished our work with RChain, we began searching for a consensus algorithm that we felt would be both scalable, safe, and reliable in real-world conditions.
One observation our researchers made is that while double-spends and other sort of “conflicting” transactions are possible in the blockchain context, they are not typical. Therefore, we decided to look for an algorithm that achieved what we call “optimistic consensus.” That is to say, when a user sends us a transaction, we want to assume ne network will eventually confirm the transaction and do as little work as possible while confirming. This approach turned out to be quite elegant, as we’ll explain throughout the remainder of this article.
Another observation we made is that most consensus protocols in the literature have separate message passing schemes for sharing proposed values and for voting. In the large-scale blockchain context, however, this makes things quite complicated. We’ll talk about bandwidth constraints later, but the desire to “pipeline” transactions and voting into a single block production scheme simplifies not only the protocol, but its analysis.
A third observation is that if we can prove that two transactions cannot conflict with each other, then we don’t need to decide on an order for them. In general, we don’t need a total order, but merely a partial order on non-conflicting transactions.
We believe Casanova represents a new and elegant approach to the consensus problem. We view it as highly suitable to large-scale blockchain environments, particularly for proof-of-stake or exogenously permissioned chains.
Understanding the Network
Casanova was designed knowing that FLP impossibility is a real problem for fully asynchronous Byzantine networks. Casanova therefore imposes a mild synchrony constraint upon the network: messages get delivered in some bounded amount of time. We don’t have to know the bound or even have an estimate of it, but it must exist for Casanova to be a live network. If a network partition results in all partitions lacking a fault-tolerant majority, no consensus can be found. In conditions where network delay grows unboundedly, no consensus can be found.
It is the responsibility of network node operators to provide an environment in which such failures are rare and resolvable in reasonable time. Part of the elegance of Casanova is that the requirements for systems engineers are very clear, yet very loose. In practice, we believe it will be straightforward to build and maintain a reliable, fast, and secure network.
Casanova was designed with proof-of-stake blockchains in mind, so the primary nodes on the network are referred to as validators or as clients. Proof-of-work blockchains typically refer to their primary nodes as “miners,” but Casanova lacks a proof-of-work mining function and the term validator has become industry standard. It’s also possible to run full nodes, which receive blocks but do not create them. In this way, you can keep track of the blockchain’s state without participating in creating that state.
Production blockchains seeking to achieve global scale will need a richer list of network participants. In proof-of-work contexts, the key to security and reliability is related to hashing power, but in Casanova the key parameter is the bandwidth, throughput, and to a lesser degree the computation available on the network. Throughout this article, we are focused primarily on the theory underlying Casanova and less on the specific details of any blockchain. We will mention some details when relevant, but in practice a good blockchain will specify significantly more topological and networking constraints than we do here.
One of the most important aspects of a proof-of-stake blockchain is the mechanism by which blocks are produced. Validators that produce blocks too often can choke the network. Validators that don’t produce blocks often enough may not be able to claim enough block rewards to cover their operational costs.
In Casanova, validators timestamp the blocks they create, which are produced at fixed wall-clock intervals. Consider a network with V validators. Every D time units, a validator is allowed to produce a block. Each block has a maximum size of U. Validators and full nodes must have bandwidth, B, satisfying the following:
B ≥ (3VU)/D
Normal block production will consume bandwidth at least equal to (VU)/D. We reserve another (VU)/D worth of bandwidth for partition recovery. The third (VU)/D is reserved to allow for some fluctuation in the size of the network, the amount of consensus finding, and transaction flow.
Blocks produced by validators are relatively straightforward. They contain the following data:
- Block Header:
- Parent hashes
- Transaction Hash
- Additional Metadata
- Validator Signature
- Transaction Data
Block headers consist of two parts: (1) the metadata and (2) the validator’s signature. The timestamp, parent hashes, votes, a hash of the transaction data, and any additional metadata are stored in the metadata portion of the header. The validator then signs the metadata portion with its private key and adds that to the block header. To determine the identity of a block, we define its “block hash” to be the hash of the entire block header. For subsequent blocks, these block hashes are included in the parent-hashes metadata. It’s expected that transactions will consume the majority of the block, so this scheme allows a validator to supply proof of confirmation with a much smaller amount of data.
This block structure implies a directed acyclic graph (DAG) of blocks. We sometimes refer to this as simply the “DAG” or the “blockdag.”
Validators are required to produce blocks that are totally ordered; since a validator knows its last block, it should always produce its next block as a descendant of that block. In the proof-of-stake contexts, failure to do so would be a form of equivocation and would typically be a slashing offense. Note that this creates an operational responsibility for node operations in a proof-of-stake network: validators must reliably store their blocks before sending them to the network and must be aware of previously produced blocks across node restarts and server crashes.
It is worth noting that in Casanova, it is possible to include a parent block, even when that block includes conflicts on which consensus must later be reached. You can think of this as a sort of “line item veto” of transactions within a block. Casanova is unique in its ability to accept and finalize non-conflicting transactions, while spawning consensus-finding on those that require it. This gets back to one of our original design goals–optimistic consensus—in which we try to do the least work needed to ensure that transactions clear.
Because Casanova forms consensus only on the resolution of conflicting transactions, we gain an additional reliability benefit for users that do not attempt to double-spend or otherwise spam the network. Honest users’ transactions will nearly always confirm at the maximum rate the network can achieve. When you submit conflicting transactions, then you will experience delays associated with consensus finding, but these delays should not substantially affect anyone else on the network.
The key to any successful consensus protocol is handling conflicts. In the blockchain context, transactions can’t be easily forged or replayed, so there is little risk associated with validators producing invalid transactions on their own and creating conflicts in that way. The primary type of conflict is when a single user produces transactions that cannot all be processed and, as a result, an order must be selected so that some can confirm and others can produce errors. In the trivial case, this is merely selecting between one of two transactions. In a more complex case, the blockchain may need to select an order among several possible transaction orderings.
There are some additional rules for validator behavior. Validators are not allowed to include a transaction in any block that conflicts with a transaction in an ancestor block. Validators are also disallowed from producing a block that contains two transactions that themselves conflict. In short, validators must maintain consistency of the blockdag as they produce new blocks. Conflicts can only occur between two blocks produced by different validators.
In the proof-of-stake contexts, failure to obey these rules might be referred to as a slashing condition: the validator’s stake could be eliminated and they could be ejected from the validator pool. In other permissioned blockchain contexts, there may be other sanctions or none at all. Each blockdag must decide how to handle these error conditions based on its own endogenous economic factors.
Casanova’s blockdag achieves consensus in part because everyone can see what you knew about when you produced your last block. To wit, “I know what you knew when you wrote this.” In some parts of the consensus finding process, validators need to know more than this. They also need to know that every validator made the correct choice, given what it knew at the time. As a result, Casanova requires the blockdag to specify a deterministic rule for selecting among the conflicting transactions.
In the proof-of-stake context, we believe transactions should be ordered in descending order by transaction fee. In this way, the validators extract the maximum payment from the user producing double-spends. However, all the transactions may have equal fees, in which case validators should order the transactions based on their hashes.
When conflicts occur, the goal is to get the entire network to agree in a reasonable amount of time. Agreeing represents the correctness part of the blockchain, since no two nodes can become confused about which value was selected. Agreeing in reasonable time represents the liveness condition of the network, since the agreement will eventually occur even under various network failure conditions.
In order to reach agreement, validators must calculate a score for each of the transactions based on what the validator has seen from everyone else. Casanova uses a scoring method based on the ancestor-or-self relation, where scores are calculated based on how many validators have previously accepted a transaction. Validators are looking to establish that a fault-tolerant majority (FTM) of all validators have observed and accepted a transaction.
There are a couple of items of interest related to scoring. First of all, in the proof-of-stake context, scoring may be performed using validator weights that are unequal. Weights may be assigned to validators under a variety of schemes, but one scheme that’s commonly proposed is to score based on the amount of stake each validator has bonded to the network. In other permissioned contexts, scoring weights may be assigned in other ways. The primary restriction is that weights must be objective and observable to the entire network all the time. This restriction means that changing your weight in any way must be a transaction that is itself finalized in the usual way.
Second of all, Casanova’s validators do not have to be concerned with transactions that conflict with previously finalized transactions. Once a transaction has an FTM behind it, that transaction is confirmed and cannot be rolled back. Any future transaction that conflicts with it may be treated as an error and rejected immediately upon submission. In this way, Casanova maintains a relatively small number of transactions over which conflicts are even possible and this helps to ensure that consensus finding work is minimized.
One of the advantages we had with Casanova is that we were searching for a good consensus algorithm and could be flexible about the transaction model it implied. We knew that good consensus algorithms exist for reasonable transaction models. After all, BitCoin is an existence proof. We simply needed to find one and ensure the transaction model was acceptable.
Many projects do not have this luxury and, as a result, they have a much more difficult job to accomplish. They have to find a consensus algorithm that works for a transaction model that’s already defined. It’s notable that Casanova’s transaction model is not suitable for use with either the Ethereum EVM or RChain’s Rholang.
This is because Casanova requires transactions be partitioned into disjoint transaction domains. If you know a little mathematics, you can view this partition as an equivalence relation on the set of unfinalized transactions. In particular, this requirement enforces the following rule:
- If transactions A and B conflict, and
- If transactions B and C conflict
- Then, transactions A and C also conflict.
Under both the EVM and Rholang, there are situations in which this is not the case. This results in the validators being forced to perform an expensive search across all transactions looking for subsets of transactions that result in a conflict. In our view, this is impractically expensive, and will make consensus finding on those networks slower, more expensive, and more difficult to get correct.
It is extremely important for Casanova that transaction domains be easily identifiable. The equivalence relation on the set of outstanding transactions must be simple to compute. In the case of a UTXO ledger, this is straightforward. UTXOs each belong to an account and spends must come from the same account. Spends from each account must be totally ordered. This means that non-comparable transactions from a single account are considered conflicts.
For example, a blockchain might choose to require users to number their transactions in a monotonically increasing way, so that the user supplies a total ordering of their own transactions. Conflicts might still arrive, since users could send different transactions with the same number to different validators. But, in this case consensus finding would choose one and the network would reject the other, thus preserving the total order for the user’s account.
There are also computational models that fit Casanova’s transaction model and would enable the execution of smart contracts on a blockchain using our algorithm. Those models of computation do not look like running a single serial VM, however, as computation within a transaction domain must be able to proceed independently of other transactions domains. Instead, we have to choose models with multiple processes executing in parallel, like the communicating event loop model that NodeJS uses. Moreover, it must be straightforward to compute when two threads of execution are in the same transaction domain.
Once a conflict is detected, validators must begin a process of voting to achieve consensus. This is a typical approach for traditional consensus protocols, but somewhat unique in the blockchain context. In Casanova, votes are recorded in new blocks that validators produce. To understand how consensus finding proceeds, we define the notion of a “voting round” or a “round.” Voting on consensus proceeds in rounds until consensus is achieved. Each transaction domain in which a conflict occurs receives its own sequence of voting rounds, independent of all other transaction domains.
Observe that because transaction domains are independent in this way, there may be multiple rounds proceeding at the same time, each of which applies to a conflict arising from a different transaction domain. The consensus finding for one transaction domain will not need to wait for consensus finding in another domain to complete. As a result, consensus finding should proceed at the maximum possible rate given the network’s bandwidth and timing parameters.
Once a validator observes a conflict, it begins round 0 by creating a block that would have had conflicts as ancestors. Everything that occured before this round is treated as round -1. Round 0 is 2 blocks long and round k is (k + 2) blocks long. Assuming the network has a finite, bounded delay to communicate blocks, consensus finding should complete in the round that is two times that delay. We don’t need to know what that delay is or even have an estimate of it.
In any given round, when a validator observes one of the conflicting transactions with a score greater than or equal to a fault-tolerant-majority (FTM), that validator obtains a lock on that transaction. Each lock has an associated round number and validators may only have a single lock per transaction domain. Validators also take a lock if they observe an FTM score in a previous round that is no longer current, which may happen if network delay is increased.
When validators observe a transaction with a score of FTM or greater in some later round, they release their earlier lock and take a new lock on the transaction. If the transaction is the same they already locked, this merely updates the round number. Locks constrain the votes of validators.
Casanova’s blockdag allows each validator to not only compute the score from its own perspective, but also from the perspective of each other validator in the network at the time they created their last block. In this way, validators not only know what they should do, but can verify that other validators have behaved as they are required to behave.
Casanova uses the following decision criteria:
When a validator sees a set of validators S, in which every validator in S saw a score of FTM or greater, and when S is an FTM, then the validator decides.
We call sets such as S “FTM-observed sets.” For resolving conflicts through consensus finding, the FTM-observed set must all belong to the same round. When this occurs, the validator decides and consensus for that conflict is achieved.
Amortized Message Cost
One of the beauties of Casanova is that the amortized message cost for transactions that do not conflict is a single block. Under normal circumstances, the transactions in a block will be confirmed after the network has produced 2 times an FTM additional blocks. In the best case, this will be around 2D.
Hopefully, at this point, you have a reasonable understanding of how Casanova operates, its features, and some of the design decisions we made as we constructed it. We believe that because of its optimistic consensus, its message pipelining, and its ability to perform consensus finding in parallel, Casanova is uniquely suited for certain blockchain applications.
Pyrofex and its research team are available to assist you with your blockchain platform projects, research activities, and other select software engineering projects. Please feel free to reach out to us regarding Casanova or projects of your own.