Malicious transaction can cause network node(s) to crash and pause new transactions momentarily

This vulnerability disclosure report contains technical details of the XRP Ledger bug reported on November 25, 2024.

Date Reported: November 2024

Affected Version(s): rippled 2.0.1 to rippled 2.2.x

Summary of Vulnerability

At 13:39 UTC on November 25, 2024, the XRP Ledger experienced an issue where several nodes across the network crashed and restarted at similar times. As a result, the network temporarily stopped processing transactions for approximately 10 minutes as it recovered; no funds were lost during this period. The network resumed normal behavior and made forward progress at 13:49 UTC on the same day, after the affected nodes restarted.

Impact

Forward progress paused while nodes restarted, delaying transaction processing. During periods of instability, the XRPL consensus model favors safety over progress, so the network did not validate any new transactions for approximately 10 minutes as it recovered. There was no loss of funds. The network resumed normal behavior and forward progress at 13:49 UTC on the same day.

Technical Details

Discovery

On 18 November 2024, Mayukha Vadari (a RippleX engineer) discovered the bug while writing automated tests for the XLS-80 Permissioned Domain code. Further investigation revealed that the same bug could also be triggered through transactions that were already live. In particular, when an account’s ledger object ID was used in the CheckID field of a CheckCash transaction, rippled crashed.

The transactions and fields that caused this issue:

  • CheckCash/CheckCancel (CheckID)
  • PaymentChannelClaim (Channel)
  • NFTokenAcceptOffer (NFTokenBuyOffer/NFTokenSellOffer)
  • The CredentialID field in several transactions (this code is not yet live)

The common factor of the vulnerable transactions was that they accepted the ID of an object of a specific type (e.g. check, payment channel, NFT token offer, credential) provided in the transaction and used it to look up the object the transaction referred to. This was expected to work, and the worst that could happen (assuming there were no bugs) was that the object found was of the wrong type, which should have been handled the same way as if the object had not been found. The bug turned the “found object of unexpected type” condition into an exception, which crashed the program.
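The intended behavior described above can be illustrated with a minimal sketch. This is not rippled code: `LedgerEntry`, `LEDGER`, `read_typed`, `claim_channel`, the shortened IDs, and the `OBJECT_NOT_FOUND` label are all hypothetical names chosen for illustration. The point is only that an object of the wrong type is treated exactly like a missing object.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LedgerEntry:
    entry_type: str  # e.g. "Check", "PayChannel", "RippleState"
    object_id: str


# Hypothetical ledger state keyed by object ID (IDs shortened for readability).
LEDGER: Dict[str, LedgerEntry] = {
    "AAA1": LedgerEntry("PayChannel", "AAA1"),
    "BBB2": LedgerEntry("RippleState", "BBB2"),
}


def read_typed(object_id: str, expected_type: str) -> Optional[LedgerEntry]:
    """Return the entry only if it exists AND has the expected type."""
    entry = LEDGER.get(object_id)
    if entry is None or entry.entry_type != expected_type:
        return None  # a wrong type is indistinguishable from "not found"
    return entry


def claim_channel(channel_id: str) -> str:
    """Hypothetical handler: a bad ID fails the transaction, nothing more."""
    if read_typed(channel_id, "PayChannel") is None:
        return "OBJECT_NOT_FOUND"  # made-up label, not a rippled result code
    return "OK"
```

With this shape, passing the ID of a trust line where a payment channel is expected simply fails the transaction; no exception ever reaches the caller.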

Root Cause

The issue stems from a bug introduced in a refactor from January 2024. The refactor was meant to reduce memory pressure, particularly for pathfinding servers, by changing what the cache stores: instead of the full serialized ledger entry for some data, only the relevant key or digest needed to look up that ledger data. As part of this change, a check on the type of data found in the cache was moved to later in the code.

In some circumstances, the type was not properly checked and data returned by the caching layer was inconsistent, and could cause a server to crash if the object type didn’t match what was expected. This meant that in some transactions, if the transaction was expecting one type of object but received the ID of an object of a different type, and the object was in the server’s cache, the server would crash.
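A toy model can make the cache dependence concrete. The sketch below does not mirror rippled's caching layer; `ToyNode`, `warm`, `read`, and the `type_check_cached` flag are invented for illustration. It assumes only what the text states: the mismatch is harmless when the object is not cached, and fatal on an unfixed node when it is.

```python
from typing import Dict, Optional, Tuple

Entry = Tuple[str, str]  # (entry_type, payload); a stand-in for a ledger object


class ToyNode:
    def __init__(self, store: Dict[str, Entry], type_check_cached: bool) -> None:
        self.store = store
        self.cache: Dict[str, Entry] = {}
        self.type_check_cached = type_check_cached  # True models the fix

    def warm(self, object_id: str) -> None:
        """Model an earlier transaction pulling the object into the cache."""
        if object_id in self.store:
            self.cache[object_id] = self.store[object_id]

    def read(self, object_id: str, expected_type: str) -> Optional[Entry]:
        cached = self.cache.get(object_id)
        if cached is not None:
            if cached[0] == expected_type:
                return cached
            if self.type_check_cached:
                return None  # fixed: mismatch handled like "not found"
            # Unfixed: the mismatch surfaces as an unhandled error.
            raise RuntimeError("cached object has unexpected type")
        # Cache miss: the type is checked on the way in, so a mismatch is safe.
        entry = self.store.get(object_id)
        if entry is None or entry[0] != expected_type:
            return None
        return entry
```

In this model, the same `read` call returns `None` on a cold cache and raises only after `warm` has been called on an unfixed node, which is why identical transactions could behave differently on different servers.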

Although this bug went undetected during testing of this refactor, there is no evidence of exploitation prior to this incident.

The original investigation believed that the spread would be contained to only the node that received the bad transaction, since it would crash. However, this proved to be incorrect (likely due to the less-than-predictable nature of caching) during the 25 November incident.

Triggering Transaction

On 25 November, three PaymentChannelClaim transactions, identical apart from their sequence numbers, were submitted that referenced a RippleState (trust line) ledger object instead of the expected PayChannel object. All three failed as expected, without crashing any nodes. The same account then submitted a TrustSet for that RippleState object, followed by the same PaymentChannelClaim transaction again. The TrustSet transaction made the RippleState ledger object more likely to be in the cache, which then allowed the following PaymentChannelClaim transaction to crash any node that had the object in its cache. One of the transactions can be seen in ledger 92346897.

The current hypothesis for the spread is that some nodes had flushed that RippleState ledger object out of their cache and were thus able to process the transaction and propagate it to other nodes. This would also explain why there were still some crashes later when nodes started catching up again, and why the crashes eventually stopped (after a crash, a node wouldn’t have its cache populated, so it would correctly handle the invalid PaymentChannelClaim).
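The hypothesis can be walked through with a deliberately simplified simulation. Everything here is an assumption made for illustration: the node names, the `deliver` function, and the rule that a restart leaves a node with an empty cache. It models only the behavior described in the paragraph above, not real peer-to-peer propagation.

```python
from typing import Dict, List, Set, Tuple


def deliver(nodes: Dict[str, Set[str]], object_id: str) -> Tuple[List[str], List[str]]:
    """Deliver one bad transaction to every unfixed node.

    nodes maps a node name to the set of object IDs in its cache.
    Returns (nodes that crashed, nodes that processed and relayed it).
    """
    crashed: List[str] = []
    relayed: List[str] = []
    for name, cache in nodes.items():
        if object_id in cache:
            crashed.append(name)
            cache.clear()  # assumption: a restarted node comes back with an empty cache
        else:
            relayed.append(name)
    return crashed, relayed


# Hypothetical three-node network; only A and B have the object cached.
nodes = {"A": {"BBB2"}, "B": {"BBB2", "CCC3"}, "C": set()}
first = deliver(nodes, "BBB2")   # A and B crash, C processes and relays
second = deliver(nodes, "BBB2")  # after restarting, every node handles it
```

Under these assumptions the first delivery takes down only the nodes holding the object in cache, while the second delivery crashes none, which matches the observation that the crashes eventually stopped on their own.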

Note

The PaymentChannelClaim transactions without a valid PayChannel object are processed and stored in the ledger in order to collect the transaction fee. This is done to induce some cost for submitting malformed transactions for the network to process, thereby reducing spam. This is why we can see these transactions recorded in the ledger now, and this is how the transactions were propagated to other nodes, causing them to crash.
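As a rough sketch of the fee behavior in this note: a claim that names no valid PayChannel still costs the sender the transaction fee and is still recorded. The function `apply_claim`, the result labels, and the list used as a ledger are hypothetical and far simpler than the real transaction engine; amounts are in drops.

```python
from typing import List, Tuple


def apply_claim(balance_drops: int, fee_drops: int, channel_found: bool,
                ledger: List[Tuple[str, int]]) -> int:
    """Record the transaction and charge the fee whether or not it succeeds."""
    # Made-up result labels; rippled uses its own result codes.
    result = "applied" if channel_found else "failed_fee_claimed"
    ledger.append((result, fee_drops))  # the failed transaction is stored too
    return balance_drops - fee_drops    # the fee is deducted in both cases
```

Because the failed transaction is part of the ledger, every node that processes that ledger has to process the transaction as well, which is the propagation path the note describes.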

Remediation

This bug is fixed by properly type checking cached data. Given the relative ease of exploiting the bug, the fix was obfuscated by including it within a larger commit.

These changes were included in version 2.3.0, released on 25 November. The team had originally planned the release for 2 December to avoid publishing it during a US holiday week when engineering support would be less available. The team decided against a 2.2.4 release with just the fix, since QA had already verified 2.3.0, and a smaller release wouldn’t obfuscate the fix as well. Naturally, given the urgency, the release was pushed out after additional last-minute checks and subsequent coordination with UNL validators.

The reason for requiring all nodes to update as soon as possible was that once a node had been upgraded, it would propagate the invalid transaction to other nodes, which could then cause a wave of crashes in nodes that had not yet updated.

Steps to Reproduce

As part of this full disclosure, an example of how to trigger the bug is covered by this unit test, which will be added to the rippled codebase. In the controlled setting of a unit test, this code sets the state of the cache system such that a specific series of calls would trigger the bug (if the fix were not present). This test was used to confirm the fix in rippled 2.3.0, but was excluded from the release at the time to avoid malicious exploitation of the bug while users upgraded their rippled software.

Fixes / Patches Available

A patch has been implemented to fix the issue and the fix is available in rippled 2.3.0.

If you haven't already done so, we strongly recommend upgrading to 2.3.0 as soon as possible.

Acknowledgements

Thanks to RippleX, Wietse Wind, and Jon Nilsen for helping investigate this vulnerability, with a particular hat tip to Mayukha Vadari.

And thanks to the global community of validators, developers, and contributors who keep the XRP Ledger running and help keep the network safe and secure.

Contact

For more information or to report further issues, please contact the team at [email protected].

Incident Response Timeline

Key Actions | Timestamp | Description
Initial Discovery | November 25, 2024 13:39 UTC | Issue identified and reported.
Mitigation Action Taken | November 25, 2024 13:49 UTC | Following a pause in the validation of new transactions, the network resumed normal behavior and forward progress at 13:49 UTC. There was no loss of funds.
Resolution Completed | November 25, 2024 | The vulnerability has been fully mitigated and the fix was made available as part of rippled 2.3.0 release.
Blog Published | November 27, 2024 | Blog published with private disclosure and included a call to upgrade to rippled 2.3.0.
Report Published | January 10, 2025 | Report published with technical details after 80% of the network has upgraded to rippled 2.3.0.