perf: move nano particles out of sim frame by bruno-dasilva · Pull Request #7969 · beyond-all-reason/Beyond-All-Reason

bruno-dasilva · 2026-06-16T00:17:06Z

NOTE: Draft until full testing is complete + screenshots/vids are added.

Context

The nano particles GL4 gadget currently does a lot of work in the GameFrame callin, which runs during sim frames. This slows down the sim, particularly during catch-up. So the idea is to move that work out of GameFrame and into Update.

FYI to readers: how "frames" work inside the engine is explained at the bottom, in the "Addendum" section.

Work done

Two things (in two commits):

1. Move the nano particle logic out of the GameFrame() callin and into Update(). This adds a little complexity:
   - since one Update() can run after potentially many sim frames (in catch-up), we now need to integrate over multiple sim frames too, instead of just over the existing amortization striding.
2. Add an explicit ramp-down of particles as the pool saturates, to match the old (implicit) ramping.
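The multi-frame integration described in the first item can be sketched roughly as below. This is an illustration only, written in Python rather than the gadget's Lua; `Builder`, `particles_to_emit`, `carry`, and `PARTICLES_PER_POWER_FRAME` are hypothetical names and values, not taken from the PR. The idea it shows is that emission is proportional to build power times the number of sim frames elapsed since the builder was last visited, so a jump of several frames emits the same total as stepping through them one by one.

```python
from dataclasses import dataclass

# Hypothetical tuning constant: particles per unit of build power per sim frame.
PARTICLES_PER_POWER_FRAME = 1.0 / 64.0


@dataclass
class Builder:
    build_power: float   # hypothetical stand-in for the builder's buildpower
    last_frame: int      # last sim frame this builder was integrated up to
    carry: float = 0.0   # fractional particles carried over to the next call


def particles_to_emit(b: Builder, n: int) -> int:
    """Return how many particles to spawn now that the sim has reached frame n."""
    elapsed = n - b.last_frame            # sim frames covered by this call (may be > 1)
    b.last_frame = n
    owed = b.build_power * elapsed * PARTICLES_PER_POWER_FRAME + b.carry
    count = int(owed)                     # whole particles to spawn now
    b.carry = owed - count                # keep the remainder so nothing is lost
    return count
```

Carrying the fractional remainder forward is one way to keep the long-run density independent of how the frames are batched; the actual gadget may handle this differently.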

Performance Table (TODO)

| time | before | after | diff |
|---|---|---|---|
| 0-5min | | | |
| 5-10min | | | |
| 10-20min | | | |
| 20-30min | | | |
| 30-60min | | | |

Test steps

For each of the checks below, compare behaviour before and after this change (it should look similar, if not identical):

- Look at a large number of nanos at a range of different game speeds
- Look at a large number of nanos while paused
- Run replays of a few games at max speed and compare the performance

See videos:
MASTER
- EARLY GAME: https://drive.google.com/file/d/1yuE9dKRcM_90_Ya9ivxHfH8ttSQWjRLb/view?usp=drive_link
- MID GAME: https://drive.google.com/file/d/1FCWuzqZy43l0xsqxS3lNyecWvnd4ofVc/view?usp=drive_link
- LATE GAME: https://drive.google.com/file/d/13IrT1ibgL3_lrLQPxG6UA56-YxcPVPIb/view?usp=drive_link
THIS BRANCH:
- EARLY GAME: https://drive.google.com/file/d/1QjQyhdOIYiO3W40w_1moubon41VUiMpX/view?usp=drive_link
- MID GAME: https://drive.google.com/file/d/1BlqFTP5NYYWZSDTCk5iajYmQADQSW72r/view?usp=drive_link
- LATE GAME: https://drive.google.com/file/d/1fmyun2z7yYc354sZX5DPqNNY735Isz1H/view?usp=drive_link

Screenshots:

BEFORE:

AFTER:

AI / LLM usage statement:

Claude Code did the initial POC/implementation; significant cleanup and comments are by me.

Addendum - how engine runs frames

The main loop is Update → Draw, repeating. Each iteration produces one draw frame and drains any queued sim frame packets first (0..N sim frames per iteration). CGame::Update (synced) dispatches SimFrame() calls as NETMSG_NEWFRAME packets arrive; CGame::Draw then does an unsynced update phase (CGame::UpdateUnsynced: timings, interpolation, camera, GUI, sound, world-drawer prep) followed by rendering (DrawGenesis → DrawScreenPost). The sim burst is capped at ~500 ms (minDrawFPS) so draw always gets to run. It's all one thread — sim and rendering are not concurrent; parallelism only happens inside a phase.

Conversely, if no sim frames are in the queue the main loop runs Draw/UpdateUnsynced as fast as possible — many draw iterations can pass between successive sim frames, with visuals interpolating smoothly in between via globalRendering->timeOffset.

main-loop iteration  (repeats as fast as possible)
├── CGame::Update            (synced)
│   └── SimFrame × 0..N      ← processes queued sim frames, capped at
│                              ~500ms per iteration
└── CGame::Draw              (unsynced)
    ├── UpdateUnsynced       ← unsynced update phase
    └── DrawGenesis → DrawScreenPost  ← render world + screen
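
As a toy model of the loop above (not engine code), the following Python sketch shows why one draw-side update may follow zero, one, or several sim frames. `run_main_loop`, `arrivals`, and `max_sim_per_iter` are hypothetical names, and the engine's time-based ~500 ms cap is replaced here by a simple frame-count cap, which is an assumption made to keep the example deterministic.

```python
from collections import deque


def run_main_loop(arrivals, max_sim_per_iter):
    """Simulate main-loop iterations.

    arrivals[i] is the number of sim-frame packets that arrive before
    iteration i. Returns, per iteration, how many sim frames ran before
    that iteration's single draw frame.
    """
    queue = deque()
    next_frame = 0
    sim_frames_per_draw = []
    for arrived in arrivals:
        for _ in range(arrived):
            next_frame += 1
            queue.append(next_frame)
        ran = 0
        # Count-based stand-in for the engine's time-based cap on the sim burst.
        while queue and ran < max_sim_per_iter:
            queue.popleft()
            ran += 1
        sim_frames_per_draw.append(ran)   # exactly one draw follows each burst
    return sim_frames_per_draw
```

With `arrivals = [0, 0, 1, 0, 5, 0]` and a cap of 3, the burst of five packets is spread over two iterations, so the draw-side code sees sim-frame jumps of 3 and then 2.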

| Phase | Rate | Synced? | Responsibility |
|---|---|---|---|
| Sim frame (`CGame::SimFrame`) | fixed 30 Hz (`GAME_SPEED`) | yes | advance deterministic state: units, pathing, projectiles, line-of-sight, scripts, Lua `GameFrame` |
| Draw frame (`CGame::Draw`) | variable | no | update phase (see below) + render world/screen |
| Update phase (`CGame::UpdateUnsynced`, inside draw frame) | per draw frame | no | timings, interpolation, camera, GUI, sound, world-drawer prep |

Benchmark names. fightertest reports these phases as three peer buckets Sim / Update / Render — Sim ≈ CGame::SimFrame, Update ≈ CGame::UpdateUnsynced, Render ≈ DrawGenesis → DrawScreenPost.

…eFrame

Move the per-tick particle refresh (emit/cull/homing + VBO upload) off the sim critical path into the unsynced gadget:Update callin, gated to run at most once per sim frame.

The main loop drains 0..N queued sim frames before each draw, so one Update may cover several sim frames under fast-forward / catch-up; that is handled by:
- a boundary-crossing gate (`crossed`) for the periodic polls, instead of exact `n % K == 0` which a frame jump could step over;
- a per-Update `tick` counter driving the amortized scan/homing/clamp cadences, so they stay even regardless of how far `n` jumps;
- per-builder `elapsed` integration so emission density stays proportional to buildpower*time across throttling and frame jumps;
- a range-sweeping `cullDead` that drains every `deathBuckets` frame in `(prev, n]`, not just `[n]`, so a jump can't strand VBO slots.

Removes the now-unnecessary high-gamespeed emission throttle: the work is draw-rate-bound under Update, so that pressure is gone.

Comment/doc cleanups throughout.
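
Two of the mechanisms in this commit message, the boundary-crossing gate and the range-sweeping cull, can be illustrated with a small sketch. It is in Python for illustration (the gadget itself is Lua), and while the names `crossed`, `cullDead`, and `deathBuckets` come from the message, the signatures and data layout below are assumptions, not the PR's code.

```python
def crossed(prev: int, n: int, period: int) -> bool:
    """True if some multiple of `period` lies in the half-open range (prev, n].

    Unlike `n % period == 0`, this still fires when the frame counter
    jumps over the exact multiple. Assumes non-negative frame numbers.
    """
    return n // period > prev // period


def cull_dead(death_buckets: dict, prev: int, n: int) -> list:
    """Drain every death bucket for frames in (prev, n], not just frame n.

    death_buckets maps a sim frame number to the list of particle slots
    that expire on that frame (assumed layout). Returns the freed slots.
    """
    freed = []
    for frame in range(prev + 1, n + 1):
        freed.extend(death_buckets.pop(frame, []))
    return freed
```

For example, with `period = 30` a jump from frame 29 to frame 31 never satisfies `n % 30 == 0`, but `crossed(29, 31, 30)` is true because frame 30 lies inside the jump.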

Add a saturation-driven emit keep-factor (satKeep) so particle density thins evenly as the pool fills, rather than every spray staying full until the hard cap cuts emission dead. Derived from the continuous (un-floored) 1/(runEvery*stride) form, it reproduces the old per-gameframe saturation thinning smoothly: 1.0 at an empty pool, ~1/6 at full. The pool then self-stabilises below MAX_PARTICLES, demoting the hard cap to a safety net.
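
The shape of such a keep-factor can be sketched as follows. The exact formula is not given in the commit message, so this assumes the continuous divisor (the stand-in for `runEvery*stride`) grows linearly from 1 at an empty pool to 6 at a full one; `sat_keep`, `live`, and `full_divisor` are hypothetical names, and the sketch is in Python rather than the gadget's Lua.

```python
def sat_keep(live: int, max_particles: int, full_divisor: float = 6.0) -> float:
    """Fraction of requested particles to actually emit, given pool fill.

    Assumption: the divisor ramps linearly with fill, from 1.0 (empty)
    to `full_divisor` (full), so the result goes from 1.0 down to 1/6.
    """
    fill = min(max(live / max_particles, 0.0), 1.0)   # clamp fill to [0, 1]
    divisor = 1.0 + (full_divisor - 1.0) * fill
    return 1.0 / divisor
```

Under this assumed form a half-full pool keeps 1/3.5 (about 29%) of requested particles, so density thins progressively instead of dropping to zero at the cap.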

github-actions · 2026-06-16T00:21:02Z

Integration Test Results

16 tests ±0 8 ✅ ±0 3s ⏱️ ±0s
1 suites ±0 8 💤 ±0
1 files ±0 0 ❌ ±0

Results for commit 8b81049. ± Comparison against base commit 2b7b14b.

♻️ This comment has been updated with latest results.

bruno-dasilva · 2026-06-16T06:14:32Z

One thing that may make this less of an improvement than hoped: work was already being throttled when game speed was > 1. So this change only tightens a usually-once-per-update cadence to a strictly-once-per-update one, which limits the upside.

bruno-dasilva added 2 commits June 16, 2026 00:17

bruno-dasilva force-pushed the bruno/move-nano-particles-to-update-frame branch from 53af0d6 to 4e80a60 Compare June 16, 2026 01:24

Merge branch 'master' into bruno/move-nano-particles-to-update-frame

8b81049

bruno-dasilva wants to merge 3 commits into beyond-all-reason:master from bruno-dasilva:bruno/move-nano-particles-to-update-frame
