build: raise x86-64 baseline from SSE2 to SSE4.2 #4

Open
bruno-dasilva wants to merge 1 commit into master from bruno/sse4
@bruno-dasilva

Owner

PERFORMANCE TESTING


Default GCC/Clang builds targeted SSE2 through a wall of -mno-sse3, -mno-ssse3, and -mno-sse4.* flags in the generic MARCH fallback. Replace them with -msse4.2, which also enables SSE3, SSSE3, and SSE4.1. AVX and FMA stay banned: FMA contraction changes FP bit patterns and desyncs the deterministic simulation.
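To make the FMA concern concrete, here is a minimal sketch (not taken from the engine) that evaluates `a*b + c` twice: once with the product rounded before the add, and once with a single fused rounding, which is what a contracted FMA instruction would do. The helper names and input values are hypothetical and chosen only so the two results differ; `std::fma` stands in for the instruction the compiler could emit if FMA were enabled.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Raw IEEE-754 bits of a double, so results can be compared exactly.
static std::uint64_t bits_of(double x) {
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// a*b + c with two roundings: the product is rounded to double first.
// The volatile store keeps the compiler from contracting this into an FMA.
static double mul_add_unfused(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// a*b + c with one rounding, as a fused multiply-add computes it.
static double mul_add_fused(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Print both bit patterns and report whether they match.
static bool same_bits(double a, double b, double c) {
    const std::uint64_t u = bits_of(mul_add_unfused(a, b, c));
    const std::uint64_t f = bits_of(mul_add_fused(a, b, c));
    std::printf("unfused: %016llx\nfused:   %016llx\n",
                static_cast<unsigned long long>(u),
                static_cast<unsigned long long>(f));
    return u == f;
}
```

With `a = 1 + 2^-27`, `b = 1 - 2^-27`, and `c = -1`, the exact product is `1 - 2^-54`, which rounds to `1.0` in the unfused path and yields `0`, while the fused path keeps the `-2^-54` remainder. A single differing bit like this is enough to make two clients diverge.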

The minimum x86 CPU is now Intel Nehalem (2008) or AMD Bulldozer (2011). This requires a replay-level sync validation pass before shipping: autovectorization output will differ from prior builds even without FMA.
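One possible way to surface the new minimum CPU at startup, sketched under assumptions: the function names below are hypothetical and nothing like this is part of the PR. The sketch reads the compiler's predefined macros (`__SSE4_2__`, `__AVX__`, `__FMA__`) to see what the build targets, and uses the GCC/Clang builtin `__builtin_cpu_supports` to ask the running CPU whether it has SSE4.2.

```cpp
#include <cassert>
#include <cstdio>

// What this translation unit was compiled for, read from predefined macros.
#if defined(__SSE4_2__)
constexpr bool kBuiltForSse42 = true;
#else
constexpr bool kBuiltForSse42 = false;
#endif

#if defined(__AVX__) || defined(__FMA__)
constexpr bool kBuiltWithAvxOrFma = true;
#else
constexpr bool kBuiltWithAvxOrFma = false;
#endif

// True when the running CPU supports SSE4.2.
// On non-x86 targets, or compilers without the builtin, the check is
// skipped and reports true.
bool cpu_meets_baseline() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.2") != 0;
#else
    return true;
#endif
}

// Hypothetical startup hook: print what was found and return false
// when the hardware is below the baseline.
bool check_baseline_or_report() {
    if (!kBuiltForSse42)
        std::fprintf(stderr, "note: this build does not target SSE4.2\n");
    if (kBuiltWithAvxOrFma)
        std::fprintf(stderr, "warning: built with AVX/FMA enabled; sync is not guaranteed\n");
    if (!cpu_meets_baseline()) {
        std::fprintf(stderr, "error: this build requires a CPU with SSE4.2\n");
        return false;
    }
    return true;
}
```

A check like this only helps if it runs before any code that uses the newer instructions; otherwise an older CPU would hit an illegal-instruction fault first.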

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

explicitly list sse4 flags

revert part of the code
@github-actions

github-actions Bot commented Apr 23, 2026

bar-benchmark — PR #4

candidate 24808db vs baseline eb1c69f

sim trimmed mean (ms) with 95% CI on the relative delta

| scenario | candidate | baseline | Δ (95% CI) | n cand | n base |
|---|---|---|---|---|---|
| fightertest-bots | 23.91 ms ♻️ | 23.86 ms ♻️ | +0.08% to +0.36% | 50 | 80 |
| fightertest-aircraft | 19.23 ms ♻️ | 19.17 ms ♻️ | +0.22% to +0.36% | 50 | 70 |
| fightertest-tanks | 24.88 ms ♻️ | 24.82 ms ♻️ | -0.00% to +0.44% | 50 | 70 |
| fightertest-pathfinding | 21.82 ms ♻️ | 21.77 ms ♻️ | +0.05% to +0.41% | 50 | 70 |
| lategame1 | 23.08 ms | 23.39 ms ♻️ | -2.06% to -0.59% | 50 | 110 |
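For reading the Δ column, a rough sketch of the point estimate behind it: the relative change of the candidate's trimmed mean against the baseline's. The report does not state the trim fraction or how the confidence interval is computed, so the 10% default trim and both function names here are assumptions, and the interval itself is not reproduced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Mean of the samples left after dropping floor(trim * n) values from
// each end of the sorted list. The trim fraction is an assumption.
double trimmed_mean(std::vector<double> samples, double trim = 0.10) {
    assert(!samples.empty() && trim >= 0.0 && trim < 0.5);
    std::sort(samples.begin(), samples.end());
    const std::size_t cut = static_cast<std::size_t>(trim * samples.size());
    double sum = 0.0;
    for (std::size_t i = cut; i < samples.size() - cut; ++i) {
        sum += samples[i];
    }
    return sum / static_cast<double>(samples.size() - 2 * cut);
}

// Relative delta in percent; positive means the candidate is slower.
double relative_delta_percent(double candidate_ms, double baseline_ms) {
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0;
}
```

Applied to the lategame1 row, `relative_delta_percent(23.08, 23.39)` gives roughly -1.33%, which sits inside the reported interval of -2.06% to -0.59%. The four fightertest rows come out between about +0.2% and +0.3%, also inside their intervals.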

💰 compute cost: $0.28 · 1 fresh leg · 9 cached at $0

last updated: 2026-04-25T15:30:04.654Z · workflow run
