Malicious transaction can cause network node(s) to crash and pause new transactions momentarily
This vulnerability disclosure report contains technical details of the XRP Ledger bug reported on November 25, 2024.
Date Reported: November 2024
Affected Version(s): rippled 2.0.1 to rippled 2.2.x
Summary of Vulnerability
At 13:39 UTC on November 25, 2024, the XRP Ledger experienced an issue where several nodes across the network crashed and restarted at similar times. As a result, the network stopped processing transactions for approximately 10 minutes while it recovered; no funds were lost during this period. The network resumed normal behavior and made forward progress at 13:49 UTC on the same day, after the affected nodes restarted.
Impact
Forward progress paused while nodes restarted, which delayed transaction processing. During periods of instability, the XRPL consensus model favors safety over progress, so the network did not validate any new transactions for approximately 10 minutes while it recovered. There was no loss of funds. The network resumed normal behavior and forward progress at 13:49 UTC on the same day.
Technical Details
Discovery
On 18 November 2024, Mayukha Vadari (a RippleX engineer) discovered the bug while writing automated tests for the XLS-80 Permissioned Domains code. Further investigation revealed that the same bug could also be triggered by transaction types that were already live. In particular, when an account's ledger object ID was used in the CheckID field of a CheckCash transaction, rippled crashed.
The transactions and fields that caused this issue:
- CheckCash / CheckCancel (CheckID)
- PaymentChannelClaim (Channel)
- NFTokenAcceptOffer (NFTokenBuyOffer / NFTokenSellOffer)
- The CredentialID field in several transactions (this code is not yet live)
The common factor among the vulnerable transactions was that they accepted the ID of an object of a specific type (e.g. check, payment channel, NFT offer, credential) and used it to look up the object referred to by the transaction. This was expected to work, and the worst that could happen (assuming there were no bugs) was that the object found was of the wrong type, which should have been handled the same way as an object that was not found. The bug turned the "found object of unexpected type" condition into an exception, which crashed the program.
Root Cause
The issue stems from a bug introduced in this refactor from January 2024. This refactor was meant to reduce memory pressure, particularly for pathfinding servers, by changing what is stored in the cache for some data: instead of the full serialized ledger entry, only the key or digest needed to look up that ledger data. As part of this change, a check on the type of data found in the cache was moved to later in the code.
In some circumstances, the type was not properly checked and the data returned by the caching layer was inconsistent. As a result, if a transaction expected one type of object but received the ID of an object of a different type, and that object was in the server's cache, the server would crash.
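The cache-dependent behavior described above can be illustrated with a toy model. This is a minimal sketch and not the rippled implementation: `LedgerView`, `read_buggy`, `read_fixed`, `touch`, and `TypeMismatch` are hypothetical names chosen for illustration. It only shows how the same wrong-type lookup can be harmless on a cache miss and fatal on a cache hit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LedgerEntry:
    entry_id: str
    entry_type: str  # e.g. "PayChannel", "RippleState", "Check"


class TypeMismatch(Exception):
    """Stands in for the uncaught exception that crashed the server."""


class LedgerView:
    """Toy ledger with a backing store and a cache of recently used entries."""

    def __init__(self, entries):
        self._store = {e.entry_id: e for e in entries}
        self._cache = {}

    def touch(self, entry_id: str) -> None:
        # Simulates another transaction pulling the entry into the cache.
        self._cache[entry_id] = self._store[entry_id]

    def read_buggy(self, entry_id: str, expected_type: str) -> Optional[LedgerEntry]:
        cached = self._cache.get(entry_id)
        if cached is not None:
            # Cache hit: a wrong type escalates to an exception.
            if cached.entry_type != expected_type:
                raise TypeMismatch(f"{entry_id} is a {cached.entry_type}")
            return cached
        # Cache miss: a wrong type is treated like "not found".
        entry = self._store.get(entry_id)
        if entry is None or entry.entry_type != expected_type:
            return None
        return entry

    def read_fixed(self, entry_id: str, expected_type: str) -> Optional[LedgerEntry]:
        entry = self._cache.get(entry_id)
        if entry is None:
            entry = self._store.get(entry_id)
        # Same type check on every path: a mismatch is just "not found".
        if entry is None or entry.entry_type != expected_type:
            return None
        return entry
```

In this model, `read_buggy("TL1", "PayChannel")` on a trust line entry returns `None` until `touch("TL1")` is called, after which it raises; `read_fixed` returns `None` in both cases.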
Although this bug went undetected during testing of this refactor, there is no evidence of exploitation prior to this incident.
The original investigation concluded that the impact would be contained to the node that received the bad transaction, since that node would crash before relaying it. The 25 November incident proved this incorrect, likely because of the less-than-predictable nature of caching.
Triggering Transaction
On 25 November, three PaymentChannelClaim transactions, identical except for their sequence numbers, were submitted that referenced a RippleState (trust line) ledger object instead of the expected PayChannel object. All three failed as expected, without crashing any nodes. The same account then submitted a TrustSet for that RippleState object, followed by the same PaymentChannelClaim transaction again. The TrustSet transaction made the RippleState ledger object more likely to be in the cache, which allowed the following PaymentChannelClaim transaction to crash any node that had the object in its cache. One of the transactions can be seen in ledger 92346897.
The current hypothesis for the spread is that some nodes had flushed that RippleState ledger object out of their cache and were thus able to process the transaction and propagate it to other nodes. This would also explain why there were still some crashes later when nodes started catching up again, and why the crashes eventually stopped (after a crash, a node's cache would no longer be populated, so it would correctly handle the invalid PaymentChannelClaim).
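The spread hypothesis can be made concrete with a small simulation. This is a sketch under simplifying assumptions, not a model of the real peer-to-peer overlay: the `propagate` function, the peer graph, and the node names are all hypothetical. A node that has the mismatched object cached crashes before relaying; any other node processes the transaction and forwards it to its peers.

```python
from collections import deque


def propagate(peers, cached, origin):
    """Relay one bad transaction through a toy network.

    peers:  dict mapping node name -> list of neighbouring node names
    cached: set of nodes that hold the mismatched object in their cache
    origin: node that first receives the transaction
    Returns (crashed, relayed) as two sets of node names.
    """
    crashed, relayed = set(), set()
    seen = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        if node in cached:
            crashed.add(node)  # crashes before it can relay anything
            continue
        relayed.add(node)      # handled as "object not found", then relayed
        for peer in peers[node]:
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return crashed, relayed


peers = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

# Originally expected outcome: the receiving node crashes and nothing spreads.
contained = propagate(peers, cached={"A"}, origin="A")

# Hypothesized outcome: nodes without the object cached keep relaying.
spread = propagate(peers, cached={"B", "E"}, origin="A")
```

In the second scenario, nodes A, C, and D relay the transaction while B and E crash, even though neither crashed node received it directly from the submitter.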
PaymentChannelClaim transactions that do not reference a valid PayChannel object are still processed and stored in the ledger so that the transaction fee can be collected. This imposes a cost on submitting malformed transactions for the network to process, which reduces spam. This is why these transactions are recorded in the ledger, and it is how they were propagated to other nodes, causing them to crash.
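The fee-claiming behavior can be sketched as follows. This is a toy model, not rippled's transaction engine: `apply_claim` and the data layout are hypothetical, amounts are plain integers in drops, and the specific failure code shown (`tecNO_TARGET`) is an assumption for illustration, since this report does not name the result code of the incident transactions.

```python
def apply_claim(balances, channels, ledger_txs, tx):
    """Apply a toy PaymentChannelClaim and return its result code.

    balances:   dict mapping account -> balance in drops
    channels:   set of IDs that really are PayChannel objects
    ledger_txs: list of (transaction, result) pairs recorded in the ledger
    """
    # The fee is charged whether or not the claim itself succeeds.
    balances[tx["Account"]] -= tx["Fee"]
    if tx["Channel"] in channels:
        result = "tesSUCCESS"
    else:
        # Assumed code for "referenced channel not found".
        result = "tecNO_TARGET"
    # Failed-but-fee-claimed transactions are still recorded and relayed.
    ledger_txs.append((tx, result))
    return result


balances = {"rSender": 1_000_000}
ledger_txs = []
bad_tx = {
    "TransactionType": "PaymentChannelClaim",
    "Account": "rSender",          # placeholder account name
    "Channel": "TRUSTLINE_ID",     # placeholder: ID of a RippleState object
    "Fee": 10,
}
result = apply_claim(balances, channels={"CHANNEL_ID"}, ledger_txs=ledger_txs, tx=bad_tx)
```

Because the failed transaction still changes ledger state (the sender's balance), every node has to apply it, which is why it reached nodes beyond the one it was submitted to.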
Remediation
This bug is fixed by properly type checking cached data. Given the relative ease of exploiting the bug, the fix was obfuscated by including it within a larger commit.
These changes were included in version 2.3.0, released on 25 November. The team had originally planned the release for 2 December to avoid publishing it during a US holiday week, when engineering support would be less available. The team decided against a 2.2.4 release containing only the fix, since QA had already verified 2.3.0 and a smaller release would not obfuscate the fix as well. Given the urgency, the release was published ahead of schedule, after additional last-minute checks and coordination with UNL validators.
All nodes needed to update as soon as possible because an upgraded node would process the invalid transaction and propagate it to other nodes, which could cause a wave of crashes among nodes that had not yet updated.
Steps to Reproduce
As part of this full disclosure, an example of how to trigger the bug is covered by this unit test, which will be added to the rippled codebase. In the controlled setting of a unit test, this code sets the state of the cache system such that a specific series of calls would trigger the bug (if the fix were not present). This test was used to confirm the fix in rippled 2.3.0, but was excluded from the release at the time to avoid malicious exploitation of the bug while users upgraded their rippled software.
Fixes / Patches Available
A patch has been implemented to fix the issue and the fix is available in rippled 2.3.0.
If you haven't already done so, we strongly recommend upgrading to 2.3.0 as soon as possible.
Acknowledgements
Thanks to RippleX, Wietse Wind, and Jon Nilsen (with a particular hat tip to Mayukha Vadari) for helping investigate this vulnerability.
And thanks to the global community of validators, developers, and contributors who keep the XRP Ledger running and help keep the network safe and secure.
Contact
For more information or to report further issues, please contact the team at [email protected].
Incident Response Timeline
Key Actions | Timestamp | Description |
---|---|---|
Initial Discovery | November 25, 2024 13:39 UTC | Issue identified and reported. |
Mitigation Action Taken | November 25, 2024 13:49 UTC | Following a pause in the validation of new transactions, the network resumed normal behavior and forward progress at 13:49 UTC. There was no loss of funds. |
Resolution Completed | November 25, 2024 | The vulnerability was fully mitigated and the fix was made available as part of the rippled 2.3.0 release. |
Blog Published | November 27, 2024 | Blog published with private disclosure, including a call to upgrade to rippled 2.3.0. |
Report Published | January 10, 2025 | Report published with technical details after 80% of the network had upgraded to rippled 2.3.0. |