Cache piece bounds on LocalModelPiece to skip pointer chase in UpdateBoundingVolume #13

Closed
bruno-dasilva wants to merge 2 commits into master from bruno/try-optimize-localmodelpiece-2

Conversation

@bruno-dasilva

Owner

No description provided.

bruno-dasilva and others added 2 commits April 24, 2026 21:26
…-all-reason#2949)

* fix: memory barrier bitfield now pulls from correct parameter

Looks like the original code was:

```
int LuaOpenGL::MemoryBarrier(lua_State* L)
{
	GLbitfield barriers = (GLbitfield)luaL_optint(L, 1, 0);
	//skip checking the correctness of values :)

	if (barriers > 0u)
		glMemoryBarrier(barriers);

	return 0;
}
```

and when it was merged here, the `default` argument was changed to 4 instead of the `idx` argument.

A Codex review of related code found this bug :)

* Apply suggestion from @bruno-dasilva

---------

Co-authored-by: Bruno Da Silva <Bruno-DaSilva@users.noreply.github.com>
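
The argument mix-up described in the first commit can be illustrated with a small sketch. It is not the engine's code: `OptInt` is a hypothetical stand-in for `luaL_optint(L, idx, def)` operating on a plain vector, and the two call shapes (`1, 4` versus `4, 0`) are an assumption inferred from the description above, not copied from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for luaL_optint(L, idx, def): returns the integer at
// 1-based position idx, or def when that argument is absent.
int OptInt(const std::vector<int>& args, int idx, int def) {
    if (idx >= 1 && static_cast<std::size_t>(idx) <= args.size())
        return args[static_cast<std::size_t>(idx) - 1];
    return def;
}

// Assumed buggy shape: the 4 landed in the default slot, so the bitfield is
// read from argument 1 and falls back to 4.
int BarriersBuggy(const std::vector<int>& args) { return OptInt(args, 1, 4); }

// Assumed intended shape: read argument 4 and fall back to 0 (no barrier).
int BarriersFixed(const std::vector<int>& args) { return OptInt(args, 4, 0); }
```

With arguments `{8, 1, 1, 32}`, the buggy shape returns the unrelated first argument (8) while the fixed shape returns the fourth (32). With only three arguments, the fixed shape returns 0 and no barrier would be issued.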
…BoundingVolume

LocalModel::UpdateBoundingVolume's hot loop dereferenced lmPiece.original to
read mins, maxs, and HasGeometryData() on every piece, costing a cache line
per piece in a separately-allocated S3DModelPiece. Mirror those values onto
LocalModelPiece at construction (and in the SetModel reload path) and read
them from there. Hoist the empty-piece check before the modelSpaceTra read so
empty pivot pieces no longer pay the transform load.

Algorithm and computed bounds are unchanged. New fields are CR_IGNORED and
refilled alongside `original` on reload, so save format is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
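
A minimal sketch of the caching idea in the second commit, under loud assumptions: `ModelPiece` and `LocalPiece` are hypothetical stand-ins for `S3DModelPiece` and `LocalModelPiece`, the per-piece transform is reduced to a translation (`offset`), and `UpdateBounds` is an illustrative loop, not the engine's `UpdateBoundingVolume`.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical stand-in for the shared, separately allocated model piece.
struct ModelPiece {
    Vec3 mins;
    Vec3 maxs;
    bool hasGeometry;
};

// Hypothetical stand-in for the per-instance piece.
struct LocalPiece {
    const ModelPiece* original;
    Vec3 offset;       // simplified transform: translation only (assumption)
    Vec3 mins;         // cached copy of original->mins
    Vec3 maxs;         // cached copy of original->maxs
    bool hasGeometry;  // cached copy of original->hasGeometry

    LocalPiece(const ModelPiece* mp, Vec3 off)
        : original(mp), offset(off),
          mins(mp->mins), maxs(mp->maxs), hasGeometry(mp->hasGeometry) {}

    // Reload path: refill the cached fields whenever `original` changes.
    void SetOriginal(const ModelPiece* mp) {
        original = mp;
        mins = mp->mins;
        maxs = mp->maxs;
        hasGeometry = mp->hasGeometry;
    }
};

struct Bounds { Vec3 mins; Vec3 maxs; };

// The loop reads only LocalPiece fields and never dereferences `original`.
// If no piece has geometry, the returned bounds stay at the inverted sentinel.
Bounds UpdateBounds(const std::vector<LocalPiece>& pieces) {
    const float big = std::numeric_limits<float>::max();
    Bounds b{{big, big, big}, {-big, -big, -big}};
    for (const LocalPiece& p : pieces) {
        if (!p.hasGeometry) continue;  // empty check before touching the transform
        const Vec3& t = p.offset;
        b.mins.x = std::min(b.mins.x, p.mins.x + t.x);
        b.mins.y = std::min(b.mins.y, p.mins.y + t.y);
        b.mins.z = std::min(b.mins.z, p.mins.z + t.z);
        b.maxs.x = std::max(b.maxs.x, p.maxs.x + t.x);
        b.maxs.y = std::max(b.maxs.y, p.maxs.y + t.y);
        b.maxs.z = std::max(b.maxs.z, p.maxs.z + t.z);
    }
    return b;
}
```

The trade-off in this sketch is a larger per-instance struct in exchange for one fewer dependent load per piece. Whether that pays off depends on how the surrounding data is laid out and accessed.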
@github-actions

bar-benchmark — PR #13

candidate ccc962d vs baseline eb1c69f

sim trimmed mean (ms) with 95% CI on the relative delta

| scenario | candidate | baseline | Δ (95% CI) | n cand | n base |
|---|---|---|---|---|---|
| fightertest-bots | 23.94 ms | 23.85 ms | ♻️ +0.08% to +0.57% | 10 | 60 |
| fightertest-aircraft | 19.17 ms | 19.17 ms | ♻️ -0.16% to +0.21% | 10 | 60 |
| fightertest-tanks | 24.82 ms | 24.82 ms | ♻️ -0.30% to +0.36% | 10 | 60 |
| fightertest-pathfinding | 21.83 ms | 21.78 ms | ♻️ -0.08% to +0.59% | 10 | 60 |
| lategame1 | 23.48 ms | 23.39 ms | ♻️ -1.13% to +1.44% | 10 | 100 |
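
For readers unfamiliar with the metric, here is a rough sketch of a trimmed mean and a relative delta. The trim fraction and the exact trimming rule are assumptions, since the benchmark's settings are not stated here, and the confidence-interval computation is omitted because its method is not described.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean after dropping floor(trim * n) samples from each end of the sorted data.
// The trim fraction is an assumed parameter, not the benchmark's actual value.
double TrimmedMean(std::vector<double> samples, double trim) {
    assert(!samples.empty() && trim >= 0.0 && trim < 0.5);
    std::sort(samples.begin(), samples.end());
    const std::size_t cut =
        static_cast<std::size_t>(std::floor(trim * static_cast<double>(samples.size())));
    double sum = 0.0;
    for (std::size_t i = cut; i < samples.size() - cut; ++i)
        sum += samples[i];
    return sum / static_cast<double>(samples.size() - 2 * cut);
}

// Relative change of candidate versus baseline, in percent.
double RelativeDeltaPercent(double candidate, double baseline) {
    return (candidate - baseline) / baseline * 100.0;
}
```

Applied to the rounded fightertest-bots figures, `RelativeDeltaPercent(23.94, 23.85)` gives roughly +0.38%, which falls inside the reported interval for that row.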
Per-VM distribution box plots (5): fightertest-bots, fightertest-aircraft, fightertest-tanks, fightertest-pathfinding, lategame1.

💰 compute cost: $0.58 · 5 fresh legs · 5 cached at $0

last updated: 2026-04-26T21:45:58.081Z · [workflow run](https://github.com/bruno-dasilva/RecoilEngine/actions/runs/24967440620)

@bruno-dasilva

Owner Author

literally worse
Screenshot 2026-04-26 at 11 51 49 PM
