Amendment-blocked nodes may crash on unknown serialized fields during LCL JUMP #6492

Description

@sublimator

Summary

This issue was discovered and reproduced on Xahau (xahaud#706). rippled appears to have the same vulnerable code paths, but this has not been tested there.

Amendment-blocked nodes may crash with an uncaught std::runtime_error when a JUMP (LCL switch) forces deserialization of transactions containing fields unknown to the older binary. The node correctly detects that it is amendment-blocked, but the switchLastClosedLedger code path lacks exception handling, unlike LedgerMaster::tryAdvance, which wraps doAdvance in a try/catch and survives the same error.

Crash Path

NetworkOPsImp::checkLastClosedLedger
  → switchLastClosedLedger
    → TxQ::processClosedLedger
      → FeeMetrics::update
        → std::for_each over view.txs
          → deserializeTxPlusMeta
            → STTx constructor
              → STObject::set
                → Throw<std::runtime_error>("Unknown field")

Potentially Vulnerable Code

1. switchLastClosedLedger — no try/catch

The call to processClosedLedger has no exception handling:

📍 src/xrpld/app/misc/NetworkOPs.cpp:1855-1905

1855 void
1856 NetworkOPsImp::switchLastClosedLedger(std::shared_ptr<Ledger const> const& newLCL)
1857 {
1858     // set the newLCL as our last closed ledger -- this is abnormal code
1859     JLOG(m_journal.error()) << "JUMP last closed ledger to " << newLCL->header().hash;
1860 
1861     clearNeedNetworkLedger();
1862 
1863     // Update fee computations.
1864     registry_.getTxQ().processClosedLedger(registry_.app(), *newLCL, true);
1865 
1866     // Caller must own master lock
1867     {
1868         // Apply tx in old open ledger to new
1869         // open ledger. Then apply local tx.
1870 
1871         auto retries = m_localTX->getTxSet();
1872         auto const lastVal = registry_.getLedgerMaster().getValidatedLedger();
1873         std::optional<Rules> rules;
1874         if (lastVal)
1875             rules = makeRulesGivenLedger(*lastVal, registry_.app().config().features);
1876         else
1877             rules.emplace(registry_.app().config().features);
1878         registry_.openLedger().accept(
1879             registry_.app(),
1880             *rules,
1881             newLCL,
1882             OrderedTxs({}),
1883             false,
1884             retries,
1885             tapNONE,
1886             "jump",
1887             [&](OpenView& view, beast::Journal j) {
1888                 // Stuff the ledger with transactions from the queue.
1889                 return registry_.getTxQ().accept(registry_.app(), view);
1890             });
1891     }
1892 
1893     m_ledgerMaster.switchLCL(newLCL);
1894 
1895     protocol::TMStatusChange s;
1896     s.set_newevent(protocol::neSWITCHED_LEDGER);
1897     s.set_ledgerseq(newLCL->header().seq);
1898     s.set_networktime(registry_.timeKeeper().now().time_since_epoch().count());
1899     s.set_ledgerhashprevious(
1900         newLCL->header().parentHash.begin(), newLCL->header().parentHash.size());
1901     s.set_ledgerhash(newLCL->header().hash.begin(), newLCL->header().hash.size());
1902 
1903     registry_.overlay().foreach(
1904         send_always(std::make_shared<Message>(s, protocol::mtSTATUS_CHANGE)));
1905 }

2. tryAdvance — has try/catch (survives the same error)

Compare with tryAdvance, which wraps doAdvance in a try/catch:

📍 src/xrpld/app/ledger/detail/LedgerMaster.cpp:1284-1316

1284 void
1285 LedgerMaster::tryAdvance()
1286 {
1287     std::lock_guard ml(m_mutex);
1288 
1289     // Can't advance without at least one fully-valid ledger
1290     mAdvanceWork = true;
1291     if (!mAdvanceThread && !mValidLedger.empty())
1292     {
1293         mAdvanceThread = true;
1294         app_.getJobQueue().addJob(jtADVANCE, "AdvanceLedger", [this]() {
1295             std::unique_lock sl(m_mutex);
1296 
1297             XRPL_ASSERT(
1298                 !mValidLedger.empty() && mAdvanceThread,
1299                 "xrpl::LedgerMaster::tryAdvance : has valid ledger");
1300 
1301             JLOG(m_journal.trace()) << "advanceThread<";
1302 
1303             try
1304             {
1305                 doAdvance(sl);
1306             }
1307             catch (std::exception const& ex)
1308             {
1309                 JLOG(m_journal.fatal()) << "doAdvance throws: " << ex.what();
1310             }
1311 
1312             mAdvanceThread = false;
1313             JLOG(m_journal.trace()) << "advanceThread>";
1314         });
1315     }
1316 }

3. FeeMetrics::update — iterates view.txs, triggering deserialization

📍 src/xrpld/app/misc/detail/TxQ.cpp:64-142

  64 std::size_t
  65 TxQ::FeeMetrics::update(
  66     Application& app,
  67     ReadView const& view,
  68     bool timeLeap,
  69     TxQ::Setup const& setup)
  70 {
  71     std::vector<FeeLevel64> feeLevels;
  72     auto const txBegin = view.txs.begin();
  73     auto const txEnd = view.txs.end();
  74     auto const size = std::distance(txBegin, txEnd);
  75     feeLevels.reserve(size);
  76     std::for_each(txBegin, txEnd, [&](auto const& tx) {
  77         feeLevels.push_back(getFeeLevelPaid(view, *tx.first));
  78     });
  79     std::sort(feeLevels.begin(), feeLevels.end());
  80     XRPL_ASSERT(size == feeLevels.size(), "xrpl::TxQ::FeeMetrics::update : fee levels size");
  81 
  82     JLOG((timeLeap ? j_.warn() : j_.debug()))
  83         << "Ledger " << view.header().seq << " has " << size << " transactions. "
  84         << "Ledgers are processing " << (timeLeap ? "slowly" : "as expected")
  85         << ". Expected transactions is currently " << txnsExpected_ << " and multiplier is "
  86         << escalationMultiplier_;
  87 
  88     if (timeLeap)
  89     {
  90         // Ledgers are taking to long to process,
  91         // so clamp down on limits.
  92         auto const cutPct = 100 - setup.slowConsensusDecreasePercent;
  93         // upperLimit must be >= minimumTxnCount_ or std::clamp can give
  94         // unexpected results
  95         auto const upperLimit = std::max<std::uint64_t>(
  96             mulDiv(txnsExpected_, cutPct, 100).value_or(xrpl::muldiv_max), minimumTxnCount_);
  97         txnsExpected_ = std::clamp<std::uint64_t>(
  98             mulDiv(size, cutPct, 100).value_or(xrpl::muldiv_max), minimumTxnCount_, upperLimit);
  99         recentTxnCounts_.clear();
 100     }
 101     else if (size > txnsExpected_ || size > targetTxnCount_)
 102     {
 103         recentTxnCounts_.push_back(mulDiv(size, 100 + setup.normalConsensusIncreasePercent, 100)
 104                                        .value_or(xrpl::muldiv_max));
 105         auto const iter = std::max_element(recentTxnCounts_.begin(), recentTxnCounts_.end());
 106         BOOST_ASSERT(iter != recentTxnCounts_.end());
 107         auto const next = [&] {
 108             // Grow quickly: If the max_element is >= the
 109             // current size limit, use it.
 110             if (*iter >= txnsExpected_)
 111                 return *iter;
 112             // Shrink slowly: If the max_element is < the
 113             // current size limit, use a limit that is
 114             // 90% of the way from max_element to the
 115             // current size limit.
 116             return (txnsExpected_ * 9 + *iter) / 10;
 117         }();
 118         // Ledgers are processing in a timely manner,
 119         // so keep the limit high, but don't let it
 120         // grow without bound.
 121         txnsExpected_ = std::min(next, maximumTxnCount_.value_or(next));
 122     }
 123 
 124     if (!size)
 125     {
 126         escalationMultiplier_ = setup.minimumEscalationMultiplier;
 127     }
 128     else
 129     {
 130         // In the case of an odd number of elements, this
 131         // evaluates to the middle element; for an even
 132         // number of elements, it will add the two elements
 133         // on either side of the "middle" and average them.
 134         escalationMultiplier_ =
 135             (feeLevels[size / 2] + feeLevels[(size - 1) / 2] + FeeLevel64{1}) / 2;
 136         escalationMultiplier_ = std::max(escalationMultiplier_, setup.minimumEscalationMultiplier);
 137     }
 138     JLOG(j_.debug()) << "Expected transactions updated to " << txnsExpected_
 139                      << " and multiplier updated to " << escalationMultiplier_;
 140 
 141     return size;
 142 }

4. deserializeTxPlusMeta — constructs STTx which can throw

📍 src/xrpld/app/ledger/Ledger.cpp:359-373

 359 std::pair<std::shared_ptr<STTx const>, std::shared_ptr<STObject const>>
 360 deserializeTxPlusMeta(SHAMapItem const& item)
 361 {
 362     std::pair<std::shared_ptr<STTx const>, std::shared_ptr<STObject const>> result;
 363     SerialIter sit(item.slice());
 364     {
 365         SerialIter s(sit.getSlice(sit.getVLDataLength()));
 366         result.first = std::make_shared<STTx const>(s);
 367     }
 368     {
 369         SerialIter s(sit.getSlice(sit.getVLDataLength()));
 370         result.second = std::make_shared<STObject const>(s, sfMetadata);
 371     }
 372     return result;
 373 }

5. STObject::set — throws on unknown fields

📍 src/libxrpl/protocol/STObject.cpp:209-269

 209 bool
 210 STObject::set(SerialIter& sit, int depth)
 211 {
 212     bool reachedEndOfObject = false;
 213 
 214     v_.clear();
 215 
 216     // Consume data in the pipe until we run out or reach the end
 217     while (!sit.empty())
 218     {
 219         int type;
 220         int field;
 221 
 222         // Get the metadata for the next field
 223         sit.getFieldID(type, field);
 224 
 225         // The object termination marker has been found and the termination
 226         // marker has been consumed. Done deserializing.
 227         if (type == STI_OBJECT && field == 1)
 228         {
 229             reachedEndOfObject = true;
 230             break;
 231         }
 232 
 233         if (type == STI_ARRAY && field == 1)
 234         {
 235             JLOG(debugLog().error()) << "Encountered object with embedded end-of-array marker";
 236             Throw<std::runtime_error>("Illegal end-of-array marker in object");
 237         }
 238 
 239         auto const& fn = SField::getField(type, field);
 240 
 241         if (fn.isInvalid())
 242         {
 243             JLOG(debugLog().error())
 244                 << "Unknown field: field_type=" << type << ", field_name=" << field;
 245             Throw<std::runtime_error>("Unknown field");
 246         }
 247 
 248         // Unflatten the field
 249         v_.emplace_back(sit, fn, depth + 1);
 250 
 251         // If the object type has a known SOTemplate then set it.
 252         if (auto const obj = dynamic_cast<STObject*>(&(v_.back().get())))
 253             obj->applyTemplateFromSField(fn);  // May throw
 254     }
 255 
 256     // We want to ensure that the deserialized object does not contain any
 257     // duplicate fields. This is a key invariant:
 258     auto const sf = getSortedFields(*this, withAllFields);
 259 
 260     auto const dup =
 261         std::adjacent_find(sf.cbegin(), sf.cend(), [](STBase const* lhs, STBase const* rhs) {
 262             return lhs->getFName() == rhs->getFName();
 263         });
 264 
 265     if (dup != sf.cend())
 266         Throw<std::runtime_error>("Duplicate field detected");
 267 
 268     return reachedEndOfObject;
 269 }

Root Cause

TxQ::FeeMetrics::update() iterates view.txs to collect fee levels. The ReadView::txs range eagerly deserializes every transaction via deserializeTxPlusMeta → STTx constructor → STObject::set. When a transaction contains a serialized field unknown to the binary, STObject::set throws std::runtime_error.
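
The failure mode can be illustrated with a small standalone sketch. It is not rippled or xahaud code: FieldId, KnownFields, parseFields and countParsedTxs are hypothetical names that only mimic the shape of STObject::set and FeeMetrics::update. It assumes a field is identified by a (type, field) pair and that an older binary simply has a smaller set of known pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialized field ID: (type code, field code).
using FieldId = std::pair<int, int>;

// Hypothetical stand-in for the set of fields a given binary knows about.
using KnownFields = std::set<FieldId>;

// Mimics the shape of STObject::set: walk the serialized fields and throw
// on the first one this binary does not recognise.
inline std::size_t
parseFields(std::vector<FieldId> const& serialized, KnownFields const& known)
{
    std::size_t parsed = 0;
    for (auto const& id : serialized)
    {
        if (known.count(id) == 0)
            throw std::runtime_error("Unknown field");
        ++parsed;
    }
    return parsed;
}

// Mimics the shape of FeeMetrics::update: every transaction in the ledger is
// parsed during iteration, so one unknown field aborts the whole pass.
inline std::size_t
countParsedTxs(
    std::vector<std::vector<FieldId>> const& ledgerTxs,
    KnownFields const& known)
{
    std::size_t count = 0;
    for (auto const& tx : ledgerTxs)
    {
        parseFields(tx, known);  // may throw std::runtime_error
        ++count;
    }
    return count;
}
```

In this sketch, a ledger whose transactions only use known pairs is counted normally, while a single transaction carrying an unrecognised pair makes countParsedTxs throw, leaving the decision about what happens next entirely to the caller.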

LedgerMaster::tryAdvance has a try/catch that survives this (it logs "doAdvance throws: " followed by the exception message), but switchLastClosedLedger does not; the exception propagates up and terminates the process.
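
One possible shape for a mitigation, shown only as a standalone sketch, is to guard the throwing call the same way tryAdvance guards doAdvance. runGuarded is a hypothetical helper, not an existing rippled or xahaud function, and whether a failed fee update should be skipped or should abort the JUMP is a design decision this sketch does not settle.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper: run `work`; if it throws a std::exception, store the
// message in `lastError` instead of letting the exception escape.
// Returns true when `work` completed without throwing.
inline bool
runGuarded(std::function<void()> const& work, std::string& lastError)
{
    try
    {
        work();
        return true;
    }
    catch (std::exception const& ex)
    {
        // In a real node this is where a fatal-level log line would go,
        // analogous to the "doAdvance throws: " message in tryAdvance.
        lastError = ex.what();
        return false;
    }
}
```

A caller could pass the fee-metrics update as `work` and inspect the return value before continuing. Note that this only catches exceptions derived from std::exception, which covers the std::runtime_error thrown by STObject::set.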

Expected Behavior (based on Xahau reproduction)

The following was observed on a Xahau testnet (see xahaud#706). The same code paths exist in rippled but this has not been independently reproduced.

  1. Node correctly detects unsupported amendment and logs server blocked
  2. doAdvance repeatedly throws and recovers via its try/catch
  3. Node falls behind the network; checkLastClosedLedger detects LCL divergence from peer consensus
  4. JUMP triggered: switchLastClosedLedger → processClosedLedger → crash
  5. libc++abi: terminating due to uncaught exception of type std::runtime_error: Unknown field

Labels: Bug, Triaged
