Stellar Switches To Centralized System After Node Issue Causes Accidental Fork
A failure in its consensus system has caused Stellar to fork, requiring a rollback. Stellar claims that the issue affects Ripple as well, something Ripple denies.
Update: Stellar's Jed McCaleb has issued a statement on the issue. While he does not deny that the code has been changed since its fork from Ripple, he does maintain that the Ripple protocol could be at risk. The "TL;DR" version he posted can be found at the bottom of this article, and the full version can be found here. The remainder of the article, save for the quote at the end, is in its original form.
A failure in its consensus system has caused a hard fork of the Stellar network, forcing its creators to switch to a centralized system while it awaits an upgrade. In the meantime, Stellar users who bought and sold Stellar during the affected time may have lost their coins.
In a blog post last night, Stellar released some details on the issue and its plans to fix it. The post also stated that Ripple, from which Stellar was forked, could fall victim to the same issue. Ripple has published its own blog post assuring the community that this is not the case.
The problem has to do with nodes and the way Stellar confirms transactions. On both the Stellar and Ripple networks, users send transactions to nodes, which confirm each transaction with one another before committing it to the ledger. The problem arose when certain Stellar nodes did not agree. In the Ripple protocol, if the nodes are unable to reach a strong consensus, they essentially start over and try again. Probability dictates that with each consecutive attempt, the chances of reaching a consensus increase.
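To see why repeated attempts help, here is a minimal sketch of the arithmetic. It assumes, as a simplification that is not taken from either codebase, that every attempt succeeds independently with the same probability p. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from Ripple or Stellar): chance that at least one
// of `rounds` consensus attempts succeeds, ASSUMING each attempt succeeds
// independently with probability p.
double chanceOfConsensusWithin(double p, int rounds)
{
    assert(p >= 0.0 && p <= 1.0 && rounds >= 0);
    double allFail = 1.0;
    for (int i = 0; i < rounds; ++i)
        allFail *= (1.0 - p);   // every attempt so far has failed
    return 1.0 - allFail;       // at least one attempt succeeded
}
```

Under that assumption, a network with a 50% chance of agreement per attempt would reach consensus within three attempts 87.5% of the time. Real nodes adjust their positions between attempts, so the attempts are not truly independent.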
Stellar and Ripple started out with the same consensus algorithm, and creator Jed McCaleb did state on the forums that the consensus algorithm had not been changed. However, a user on Stellartalk.org, bibbit, pointed out a git commit that seems to indicate that the code had in fact been modified. Another user, going by the name meriver, confirmed that the commit had been implemented. Finally, another user, going by the name fintechnophile, managed to reproduce conditions that cause the Stellar network to move forward and fork but do not have the same effect on the Ripple network.
The commit, titled “consensus changes: faster catch up, delay when out of sync”, seemed to be designed to help the Stellar network catch up when it falls behind. In addition to what seem to be mostly debug changes, there was a crucial change to the Stellar code that advanced the ledger without complete consensus. According to the experts I've talked with, including Spencer Lievens of Sterlingcoin and Blackcoin Foundation International Director Joshua J. Bouw, the code allowed a node to treat consensus as reached once more than 80% of its local peers agreed that there was a consensus. This possibly allowed two versions of the ledger to propagate through the system and could have caused the disagreement that led to the fork. This issue does not appear in Ripple's own protocol.
Both the experts I talked to and members of the stellartalk.org forum pointed to one section of the code as the key change:
+ // move on asap if we are behind
+ // If 80% of the nodes on your UNL have moved on, you should declare consensus
+ if (((currentFinished * 100) / (currentProposers + 1)) > 80)
+ {
+     CondLog (forReal, lsWARNING, LedgerTiming) <<
+         "We see no consensus, but 80% of nodes have moved on";
+     failed = true;
+     return true;
+ }
+ if (currentAgreeTime <= LEDGER_MIN_CONSENSUS_TIME)
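The threshold test in the snippet above uses integer arithmetic, which makes its behavior at small node counts less obvious than the "80%" comment suggests. The sketch below isolates only that expression. The wrapper function name is hypothetical, and the reading that the "+ 1" accounts for the local node is an assumption; the rest of the surrounding function is not reproduced.

```cpp
#include <cassert>

// Hypothetical wrapper around the expression quoted in the diff.
// currentFinished:  peers that have already moved on to the next ledger.
// currentProposers: peers currently proposing; the "+ 1" is ASSUMED to
//                   account for the local node.
bool movedOnThresholdReached(int currentFinished, int currentProposers)
{
    assert(currentFinished >= 0 && currentProposers >= 0);
    // Integer division truncates, and the comparison is strictly greater
    // than 80, so a result of exactly 80 does not trigger the rule.
    return ((currentFinished * 100) / (currentProposers + 1)) > 80;
}
```

With 9 proposers, 9 finished peers give 900 / 10 = 90 and the rule fires, while 8 finished peers give exactly 80 and it does not. With only 4 proposers the rule can never fire, since even 4 finished peers give 400 / 5 = 80.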
It cannot be confirmed that this was the cause of Stellar's problems, but it does confirm that the code was changed after Stellar's fork from Ripple, and the change does seem to match up with the issue. Ripple's explanation of why this issue doesn't affect its network relies on node behavior, and that behavior appears to have been modified.
One Thai exchange that accepts Stellar, BX.in.th, has already stated that it will not be changing its accounts to reverse transactions affected by the rollback.
“We will not be revisiting transactions made during that ledger fork.
What does this mean for you?
If you made a Stellar deposit during the fork window (December 1, 8:34 PST - December 2, 12:48 AM PST) and it later rolled back to your account; congratulations you have managed to double your money
If you made a Stellar withdrawal during the fork window, and if the withdrawal later rolled back so that it got removed from your stellar balance; sorry it is not your lucky day. We will not be re-sending these withdrawals.
Fortunately we have yet to have any inquiries about missing payments during the above windows; but if inquiries should surface this is our official position.
The above seems the only feasible approach for us to take, as the other options would involve either incurring, possibly significant, losses or trying to recover funds from users who got rolled back deposits.”
Stellar is taking steps to ensure the issue doesn't happen again. They are temporarily switching to a centralized system, with only one node, until a software update can be pushed.
“One validator node to ensure no ledger forks: This situation has led us to believe it is no longer safe to run the existing Ripple/Stellar consensus system with more than one validating node because doing so would expose funds in the network to potential double spends and ledger forks. To ensure no ledger forks going forward in Stellar, we have decided to temporarily only run one validating node until the new consensus algorithm is live. Therefore, like the previous partial payments flag issue, this risk will no longer exist in Stellar.
Prioritization of development resources: Given this real world occurrence of the consensus system’s previously theoretical risks, it is clear that we must prioritize the development of the new Stellar consensus algorithm and move away from the legacy consensus system to increase safety. The new Stellar consensus algorithm will not only be provably correct but also prioritize safety and fault tolerance over guaranteed termination. We believe this is a better choice since it is preferable for the system to pause than to enter divergent and contradictory states. You can keep abreast of progress on this via our Github.”
We are awaiting a response from Stellar about these concerns and will update this space as necessary.
Update: Jed McCaleb's response:
"The ripple paper and the code are not the same. The ripple code causes nodes to determine “quorum” from the nodes it heard from in its last ledger close, not from its total UNL. This is what likely causes forks and this code is live in both consensus systems.
I’m not trying to get into this blaming back and forth with Ripple Inc. We have an obligation to make it clear to people what we have seen and what we believe the issues are. When we first reached out to Prof Mazieres, it was to get an independent 3rd party review of the ripple algorithm from a respected computer scientist so we could be certain that it worked. This is what should be done for any complicated algorithm. Bitcoin has had its paper rigorously reviewed and generally has passed all such review on a technical level. We wanted to do the same thing but unfortunately for us, the algorithm did not pass Prof. Mazieres’s review and we do not know any distributed systems expert who is not employed by Ripple Inc who has reviewed the algorithm and thinks it works."
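McCaleb's first point, that quorum is measured against the nodes a validator heard from rather than against its total UNL, can be illustrated with a small sketch. This is not the Ripple or Stellar code: both function names are hypothetical, and the 80% threshold is borrowed from the figure discussed earlier in this article purely for illustration.

```cpp
#include <cassert>

// ILLUSTRATIVE ASSUMPTION: 80% is reused from the figure cited in the article.
const int kQuorumPercent = 80;

// Quorum measured only against the nodes heard from in the last ledger close.
bool quorumAmongHeard(int agreeing, int heardFrom)
{
    assert(agreeing >= 0 && agreeing <= heardFrom);
    return heardFrom > 0 && agreeing * 100 >= kQuorumPercent * heardFrom;
}

// Quorum measured against the full UNL, whether or not every node was heard.
bool quorumAgainstUnl(int agreeing, int unlSize)
{
    assert(agreeing >= 0 && agreeing <= unlSize);
    return unlSize > 0 && agreeing * 100 >= kQuorumPercent * unlSize;
}
```

Suppose a node has 10 validators on its UNL but heard from only 5 of them, 4 of which agree. Measured against the nodes heard, that is 80% and counts as a quorum; measured against the full UNL, it is only 40%. Two groups that cannot hear each other could each pass the first test, which is the kind of divergence McCaleb describes.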
The rest of McCaleb's statement can be read here.