perf(limit): cache net_usage in FrameLimitTracker #298
The per-opcode hot path runs AdditionalLimit::check_limit on every instruction, which in turn invokes each of the four sub-trackers' check_limit -> tx_usage -> FrameLimitTracker::net_usage. The previous implementation recomputed net_usage by walking the entire frame stack and calling FrameLimitEntry::used() (checked_add + expect) on every entry, making it O(depth) per tracker per opcode -- even though only compute_gas mutates its state on a typical arithmetic/stack/control opcode.

Maintain Σ(persistent + discardable) and Σ refund incrementally on FrameLimitTracker, so net_usage() is a single saturating_sub.

All mutations now go through four cache-aware helpers (add_tx_persistent, add_frame_persistent, add_frame_discardable, add_frame_refund) and the existing pop_frame, which subtracts the child's discardable_usage and refund from the cache on revert (those values vanish rather than being merged into the parent). The four sub-trackers are updated to call the helpers instead of touching FrameLimitEntry fields directly.

Behavior is unchanged: net_usage uses saturating_sub as before, so the transient negative state (refund > used) observed in state_growth still clamps to zero externally. The unused tx_mut accessor is removed.
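The incremental bookkeeping described above can be sketched as follows. This is a minimal illustration, not the code in this PR. The field layout of FrameLimitEntry, the use of frames[0] as the tx-level entry, the `reverted` flag on pop_frame, and the choice to merge persistent usage into the parent on both success and revert are all assumptions made for the example.

```rust
/// Simplified stand-in for the per-frame entry (field layout is an assumption).
#[derive(Default, Clone, Copy)]
pub struct FrameLimitEntry {
    persistent_usage: u64,
    discardable_usage: u64,
    refund: u64,
}

/// Frame stack plus two running sums that mirror it.
pub struct FrameLimitTracker {
    frames: Vec<FrameLimitEntry>, // frames[0] stands in for the tx-level entry
    cached_total_used: u64,       // sum of (persistent + discardable) over all frames
    cached_total_refund: u64,     // sum of refund over all frames
}

impl FrameLimitTracker {
    pub fn new() -> Self {
        Self {
            frames: vec![FrameLimitEntry::default()],
            cached_total_used: 0,
            cached_total_refund: 0,
        }
    }

    pub fn push_frame(&mut self) {
        self.frames.push(FrameLimitEntry::default());
    }

    fn top(&mut self) -> &mut FrameLimitEntry {
        self.frames.last_mut().expect("tx-level entry is always present")
    }

    // Each helper updates the entry and the matching cached sum together.
    pub fn add_frame_persistent(&mut self, n: u64) {
        self.top().persistent_usage += n;
        self.cached_total_used += n;
    }

    pub fn add_frame_discardable(&mut self, n: u64) {
        self.top().discardable_usage += n;
        self.cached_total_used += n;
    }

    pub fn add_frame_refund(&mut self, n: u64) {
        self.top().refund += n;
        self.cached_total_refund += n;
    }

    /// On revert the child's discardable usage and refund leave the cache;
    /// on success they move to the parent and the sums stay the same.
    pub fn pop_frame(&mut self, reverted: bool) {
        assert!(self.frames.len() > 1, "cannot pop the tx-level entry");
        let child = self.frames.pop().expect("length checked above");
        if reverted {
            self.cached_total_used -= child.discardable_usage;
            self.cached_total_refund -= child.refund;
        }
        let parent = self.top();
        parent.persistent_usage += child.persistent_usage; // assumed to survive revert
        if !reverted {
            parent.discardable_usage += child.discardable_usage;
            parent.refund += child.refund;
        }
    }

    /// O(1): no walk over the frame stack.
    pub fn net_usage(&self) -> u64 {
        self.cached_total_used.saturating_sub(self.cached_total_refund)
    }
}
```

In this sketch, a child frame that adds 7 discardable units and 2 refund units on top of 10 persistent units in the tx-level entry yields a net usage of 15, and reverting that child brings it back to 10 without touching any other frame.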
|
Codecov Report: ✅ All modified and coverable lines are covered by tests.
|
The cache invariant is correctly maintained across all paths:
One gap against CLAUDE.md policy: the benchmark checkbox is explicitly unchecked. The guideline requires hot-path changes to be validated locally with |
|
/benchmark |
Criterion Benchmark Comparison
147 benchmarks total, 72 with >5% change
Significant changes (>5%):
147 benchmarks: 0 regressions, 0 warnings, 38 improvements |
Make FrameLimitEntry budget fields module-private so the cache invariant is type-system enforced. Add a debug-only net_usage_uncached() reference walk and assert per-call equivalence so any future mutation bypassing a helper trips on every opcode in debug/test builds. Add a focused unit test exercising push/mutate/pop sequences across success and revert.
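A rough sketch of the two safeguards this commit describes: module-private fields, so code outside the module has to go through the helpers, and a debug-only reference walk compared against the cached value on every call. The module name `limit`, the reduced field set, and the exact assertion message are assumptions for illustration. Only net_usage_uncached and the helper names come from the commit message.

```rust
mod limit {
    /// Fields are private to this module, so callers outside `limit`
    /// cannot mutate an entry without also updating the cache.
    #[derive(Default)]
    pub struct FrameLimitEntry {
        discardable_usage: u64,
        refund: u64,
    }

    pub struct FrameLimitTracker {
        frames: Vec<FrameLimitEntry>,
        cached_total_used: u64,
        cached_total_refund: u64,
    }

    impl FrameLimitTracker {
        pub fn new() -> Self {
            Self {
                frames: vec![FrameLimitEntry::default()],
                cached_total_used: 0,
                cached_total_refund: 0,
            }
        }

        pub fn add_frame_discardable(&mut self, n: u64) {
            let entry = self.frames.last_mut().expect("at least one frame");
            entry.discardable_usage += n;
            self.cached_total_used += n;
        }

        pub fn add_frame_refund(&mut self, n: u64) {
            let entry = self.frames.last_mut().expect("at least one frame");
            entry.refund += n;
            self.cached_total_refund += n;
        }

        /// Reference implementation: walks every frame. Compiled only when
        /// debug assertions are enabled.
        #[cfg(debug_assertions)]
        fn net_usage_uncached(&self) -> u64 {
            let used = self.frames.iter().fold(0u64, |acc, e| {
                acc.checked_add(e.discardable_usage).expect("usage overflow")
            });
            let refund: u64 = self.frames.iter().map(|e| e.refund).sum();
            used.saturating_sub(refund)
        }

        pub fn net_usage(&self) -> u64 {
            let cached = self.cached_total_used.saturating_sub(self.cached_total_refund);
            // Cross-check against the walk in debug/test builds only.
            #[cfg(debug_assertions)]
            debug_assert_eq!(cached, self.net_usage_uncached(), "net_usage cache out of sync");
            cached
        }
    }
}
```

Because the check sits inside net_usage itself, any mutation path that skipped a helper would fail the assertion the next time the value is read in a debug or test build, while release builds keep the O(1) path.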
|
LGTM. The cache invariant is correctly maintained across every mutation path — helpers update both the entry and the cache atomically, and |
lychee >=0.24 changed `include_fragments` from bool to enum. Use "full" to preserve the previous behavior (anchor + text fragments).
|
LGTM. All prior feedback addressed — benchmarks ran and confirmed (up to 38% improvement on call-heavy workloads, no regressions). Cache invariant is sound: helpers atomically update both the entry and the cached totals, |
Summary
- AdditionalLimit::check_limit runs on every opcode and fans out to four sub-trackers, each previously walking the full frame stack inside FrameLimitTracker::net_usage, making the hot path O(depth) per tracker per opcode even when only compute_gas actually mutated state.
- Maintain Σ(persistent + discardable) and Σ refund incrementally on FrameLimitTracker via cached_total_used / cached_total_refund, so net_usage() becomes a single saturating_sub.
- All mutations go through four cache-aware helpers (add_tx_persistent, add_frame_persistent, add_frame_discardable, add_frame_refund) and the existing pop_frame, which on revert subtracts the child's discardable_usage and refund from the cache (those values vanish rather than being merged into the parent). The four sub-trackers (compute_gas, data_size, kv_update, state_growth) are updated to call the helpers instead of touching FrameLimitEntry fields directly. The unused tx_mut accessor is removed.
Behavior
- net_usage() still uses saturating_sub, so the transient refund > used state observed in state_growth continues to clamp to zero externally.
- pop_frame as before.
Test plan
- cargo fmt --all --check
- cargo clippy --workspace --lib --examples --tests --benches --all-features --locked
- crates/mega-evm/tests/ suites covering frame revert / refund accounting (relied on in CI)
- transact to confirm the hot-path improvement is realized
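The clamping noted under Behavior can be illustrated with a small worked example. CachedTotals and clamp_demo are hypothetical names used only here. The point is that the two sums are stored separately and unclamped, so a refund recorded before the matching usage reads as zero externally and then resolves to the true difference once usage catches up.

```rust
/// Hypothetical pair of running totals, standing in for the cached sums.
struct CachedTotals {
    used: u64,
    refund: u64,
}

impl CachedTotals {
    /// External view: never negative, clamps to zero while refund > used.
    fn net_usage(&self) -> u64 {
        self.used.saturating_sub(self.refund)
    }
}

/// Returns the net usage observed during and after the transient state.
fn clamp_demo() -> (u64, u64) {
    // Transient state: more refund than usage has been recorded so far.
    let mut totals = CachedTotals { used: 1, refund: 3 };
    let during = totals.net_usage(); // clamps to 0

    // Later usage arrives; the stored sums were never clamped, so the
    // difference is exact again: 6 - 3 = 3.
    totals.used += 5;
    let after = totals.net_usage();

    (during, after)
}
```

Storing a single pre-clamped net value instead would lose the 2 units of outstanding refund in the transient state, which is one reason to keep the two sums apart.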