Luajit spike #29
Draft
bruno-dasilva wants to merge 13 commits into
Vendor the LuaJIT 2.1 rolling release (GC64, JIT enabled) under rts/lib/luajit/ as the basis for migrating the engine's Lua off the modified PUC Lua 5.1. Source only; build artifacts and nested .git stripped. Builds and runs standalone on the gcc 13.3 toolchain. First pass intentionally ignores sync determinism to measure the raw performance delta before deciding whether the determinism work is worth it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Repurpose rts/lib/lua so the engine links LuaJIT 2.1 instead of the modified PUC Lua 5.1, without touching the rts/Lua binding layer:
- CMakeLists builds LuaJIT via its own Makefile (static, PIC, amalgamated) and wraps it in the `lua` target carrying LuaUser.cpp + a custom-symbol shim.
- lua.h/lualib.h/lauxlib.h/luaconf.h become forwarders to LuaJIT's headers (wrapped in extern "C"); lua.h re-declares the fork's custom symbols.
- LuaInclude.h: GetLuaContextData now reads the context via lua_getallocf (LuaJIT keeps global_State opaque); the L->errorJmp pcall check becomes a conservative spring_lua_in_pcall() stub; lua_lock/unlock map to the (no-op) LuaMutex* hooks since LuaJIT exposes no lock hooks.
- spring_luajit_shim.cpp implements lua_calchash/lua_pushhstring, lua_set_* (no-op io sandbox), and luaL_loadbuffer_privileged.
- SerializeLuaState.cpp stubbed: CReg Lua save/load walked PUC internals that do not exist in LuaJIT (disabled; irrelevant to a bot-game bench).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- lauxlib.h forwarder: add PUC 5.1 compat aliases (luaL_reg -> luaL_Reg, luaI_openlib -> luaL_openlib) needed by vendored luasocket.
- LuaInclude.h: the fork built on 32-bit LUA_NUMBER (float) / LUA_INTEGER (int) and lua_toboolean->bool; LuaJIT uses double / ptrdiff_t / int. Narrow the engine-facing accessors back at the boundary via function-like macros (lua_tonumber/luaL_checknumber -> float, lua_tointeger/luaL_checkinteger -> int) and cast the opt-wrappers, so type-exact templates (std::clamp/min) keep compiling. Add spring_lua_toboolean and luaL_checknumber_noassert adapters.
spring-headless now builds and links LuaJIT (GC64, JIT on).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
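The boundary narrowing described in this commit can be illustrated with a small standalone sketch. It does not use the real Lua headers: `vm_tonumber`, `engine_tonumber` and `clamp_unit` are hypothetical names, with `vm_tonumber` standing in for a double-returning accessor such as LuaJIT's `lua_tonumber`. The actual LuaInclude.h uses function-like macros; an inline function is used here only to keep the example self-contained.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for a VM accessor that returns double
// (LuaJIT's lua_Number is double).
inline double vm_tonumber(double stored) { return stored; }

// Narrowing adapter at the engine boundary: call sites written against a
// float-based lua_Number keep receiving float.
inline float engine_tonumber(double stored) {
    return static_cast<float>(vm_tonumber(stored));
}

// Engine-style call site. std::clamp deduces one type T from all three
// arguments, so std::clamp(vm_tonumber(x), 0.0f, 1.0f) would fail to compile
// (double vs. float). With the adapter every argument is float.
inline float clamp_unit(double stored) {
    return std::clamp(engine_tonumber(stored), 0.0f, 1.0f);
}

// Usage example: values outside [0, 1] are clamped to the bounds.
inline void example() {
    assert(clamp_unit(2.5) == 1.0f);
    assert(clamp_unit(-1.0) == 0.0f);
}
```

Narrowing once at the boundary keeps the existing call sites unchanged, at the cost of discarding the extra double precision the VM carries.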
BAR content (e.g. modules/lava.lua) contains string escapes like \^ that
Lua 5.1 silently accepts (drops the backslash, keeps the char). LuaJIT
follows Lua 5.2+ and errors ("invalid escape sequence"), which aborted
LuaRules/LuaUI loading. Patch lj_lex.c's lex_string to fall through and
save the char for unknown non-digit escapes, matching 5.1. Malformed
\x/\u/\ddd escapes keep their own errors.
Also make the LuaJIT archive rebuild when any vendored source changes.
With this, spring-headless runs the fightertest benchmark startscript to
completion (frame 2100, clean exit).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
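The two escape-handling behaviours contrasted in this commit can be sketched in isolation. This is not the lj_lex.c patch: `decode_escapes` is a hypothetical helper that handles only a handful of known escapes, and it reports decimal escapes as errors simply because they are out of scope here.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Decode backslash escapes in the body of a string literal.
// lenient == true  : an unknown non-digit escape drops the backslash and keeps
//                    the character (the Lua 5.1 behaviour described above).
// lenient == false : an unknown escape is an error (Lua 5.2+ style).
// Decimal escapes (\ddd) are not implemented and always raise an error.
inline std::string decode_escapes(const std::string& src, bool lenient) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] != '\\' || i + 1 == src.size()) {
            out += src[i];  // ordinary character (or a trailing backslash)
            continue;
        }
        const char c = src[++i];
        switch (c) {
            case 'n':  out += '\n'; break;
            case 't':  out += '\t'; break;
            case '\\': out += '\\'; break;
            case '"':  out += '"';  break;
            default:
                if (!lenient || std::isdigit(static_cast<unsigned char>(c)))
                    throw std::runtime_error("invalid escape sequence");
                out += c;  // keep the char, drop the backslash
        }
    }
    return out;
}

// Usage example: "\^" decodes to "^" in lenient mode.
inline void example() { assert(decode_escapes("a\\^b", true) == "a^b"); }
```

In strict mode the same input throws, which corresponds to the load failure the commit describes.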
…d hash)
Make the synced VM deterministic across clients so LuaJIT can drive synced
gameplay, while unsynced keeps the full JIT:
- Fixed string-hash seed: build LuaJIT with LUAJIT_SECURITY_{PRNG,STRHASH,
STRID}=0 so g->str.seed is fixed -> table iteration order (pairs/next) is
identical on every client/CPU/run.
- Deterministic math: route the interpreter's transcendentals and the ^
operator through streflop's bundled fdlibm (bit-identical cross-platform),
via lj_sfm_* wrappers (spring_luajit_detmath.cpp) and vm_x86.dasc redirects.
Exact ops (sqrt/floor/ceil/mod/+-*/) stay native (IEEE-correctly-rounded).
- Synced VM runs interpreter-only: CSyncedLuaHandle::Init calls
luaJIT_setmode(ENGINE|OFF), eliminating JIT-vs-interpreter FP divergence
(e.g. the x^2 -> x*x fold) and CPU-dependent codegen. Unsynced keeps JIT.
- math.random already routes to the engine's synced RNG; ffi is not opened
in synced states.
Builds and runs the fightertest benchmark to completion; sim perf unchanged
(8.77 ms/frame, still ~6% faster than PUC) since synced sim is dominated by
C++ callout bodies, not Lua interpretation. NOTE: cross-platform determinism
still needs validation via the sync-fuzz harness (x64 Linux/Windows, ARM64).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
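Why a fixed string-hash seed matters for iteration order can be shown with a toy model. The hash below is a seeded FNV-1a variant chosen for brevity, not LuaJIT's string hash, and `iteration_order` is a hypothetical helper that simply orders keys by hash value, as a stand-in for a hash-ordered table traversal.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy seeded string hash (FNV-1a mixed with a seed). NOT LuaJIT's hash.
inline std::uint32_t seeded_hash(const std::string& s, std::uint32_t seed) {
    std::uint32_t h = 2166136261u ^ seed;
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Order keys the way a hash-ordered traversal would visit them.
// The result depends only on the keys and the seed.
inline std::vector<std::string> iteration_order(std::vector<std::string> keys,
                                                std::uint32_t seed) {
    std::stable_sort(keys.begin(), keys.end(),
                     [seed](const std::string& a, const std::string& b) {
                         return seeded_hash(a, seed) < seeded_hash(b, seed);
                     });
    return keys;
}

// Usage example: the same seed yields the same order on every call.
inline void example() {
    const std::vector<std::string> keys = {"pairs", "next", "unit", "team"};
    assert(iteration_order(keys, 0u) == iteration_order(keys, 0u));
}
```

If each client drew its own random seed, the hash values and therefore the visiting order could differ between clients; pinning the seed removes that source of divergence.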
…y uses
The previous Tier 1 commit redirected math through streflop in vm_x86.dasc, but TARGET_LJARCH=x64, so the build consumes vm_x64.dasc; the redirects never compiled in (the benchmarked binary still called libc libm). Fixes:
- vm_x64.dasc: route math.log/sin/cos/.../pow and the `^` operator through lj_sfm_* (streflop), matching what the PUC fork's lmathlib did. This is the dasc DynASM actually processes on x86_64.
- CMakeLists: build only the `libluajit.a` target (LJCORE_O=ljamalg.o), not the full `amalg`/`all`. `all` also links the standalone luajit exe, which fails: lj_sfm_* are defined in liblua's detmath (C++/streflop), not LuaJIT's C world. Also drop *.h from the dependency GLOB (a `make clean` deletes generated headers, leaving phantom DEPENDS with no rule).
- Split detmath into its own archive linked AFTER libluajit.a: the dependency is one-way (libluajit.a -> detmath -> streflop), so detmath must follow on the link line or the linker discards lj_sfm_* before the reference appears.
- detmath: forward the global ::fastiroot(double) to streflop_libm::fastiroot. mpsqrt.cpp declares fastiroot at global scope but defines it in-namespace; the engine never pulled mpsqrt.o before, but streflop::pow does. Shimmed here to keep the spike self-contained rather than patching the streflop submodule.
Verified: all 15 lj_sfm_* land in the binary, lj_ff_math_pow calls lj_sfm_pow, fightertest runs to f=2100 exit 0.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Double-typed streflop fdlibm routines (__ieee754_pow et al.) return garbage in the engine's synced FPU context (float precision, streflop_init<Simple> per LuaUser.cpp) -- e.g. pow(2,3) came back as 92581. This silently broke every BAR Lua path that relies on the `^` operator for exact integer powers; notably base64Decode (which does 2^shift), so scenario/benchmark JSON failed to decode and the fightertest battle never spawned any units. The earlier "6% sim / 24% draw" numbers were measured against that empty simulation and are invalid. Match the PUC fork exactly: its lmathlib routed math through streflop's Simple (32-bit float) functions, which agree with the ambient float-precision FPU mode and are the precision BAR's synced code was written against. Wrapping args in streflop::Simple fixes the decode; fightertest now spawns ~8160 units and runs to f=2100, matching PUC's ~8260 within float-RNG drift. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
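The failure mode described in this commit, a base64 decoder that places bits with `2^shift` and therefore breaks when the power function misbehaves, can be reproduced in miniature. This sketch uses the standard library rather than streflop: `pow_float` is a hypothetical stand-in for evaluating the power at float precision, and `base64_decode` is an illustrative decoder, not BAR's Lua implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>

// The power function is injectable so the decoder can be run against
// different implementations.
using PowFn = double (*)(double, double);

// Power evaluated at float precision (assumption: this mirrors the idea of
// wrapping the arguments in a 32-bit float type; it is not streflop).
inline double pow_float(double base, double exponent) {
    const float r = std::pow(static_cast<float>(base), static_cast<float>(exponent));
    return static_cast<double>(r);
}

// Base64 decoder that positions each 6-bit value with 2^shift instead of a
// bit shift, as Lua 5.1 code without bit operators typically does.
inline std::string base64_decode(const std::string& in, PowFn pw) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    std::uint32_t acc = 0;
    int n = 0;
    auto flush = [&](int bytes) {  // emit the top `bytes` bytes of the 24-bit group
        for (int i = 0; i < bytes; ++i)
            out += static_cast<char>((acc >> (16 - 8 * i)) & 0xFF);
        acc = 0;
        n = 0;
    };
    for (char c : in) {
        if (c == '=') break;                      // padding ends the data
        const std::size_t pos = alphabet.find(c);
        if (pos == std::string::npos) continue;   // skip non-alphabet characters
        const int shift = 18 - 6 * n;
        acc += static_cast<std::uint32_t>(static_cast<double>(pos) * pw(2.0, shift));
        if (++n == 4) flush(3);
    }
    if (n == 2) flush(1);
    else if (n == 3) flush(2);
    return out;
}

// Usage example: with an exact power function, "TWFu" decodes to "Man".
inline void example() { assert(base64_decode("TWFu", pow_float) == "Man"); }
```

Passing a power function that returns a wrong constant (such as the 92581 reported above) makes the same input decode to different bytes, which is how a broken `^` can silently corrupt every payload decoded this way.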
…tree)
LuaJIT's Makefile builds strictly in-tree, writing intermediate objects (host/minilua.o) and generated headers (luajit.h, lj_vm.S, lj_*def.h) into src/. The official docker build mounts the source read-only, so the in-tree make failed with 'can't create host/minilua.o: Read-only file system'. Copy the vendored luajit tree into the CMake binary dir and build there (make clean first, since copy_directory doesn't prune stale objects), and repoint consumers' include path at the copy so they find the generated luajit.h. Local writable-source builds behave the same and no longer write artifacts back into the source tree.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
LuaJIT's Makefile detects the target arch at parse time via $(CC) -E lj_arch.h, so even the 'clean' invocation needs a CC that exists. The docker image only has gcc-13, not a bare gcc, so clean aborted with 'Unsupported target architecture'. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The vendored LuaJIT build needs a host compiler for its build-time tools (minilua/buildvm run on the build machine and emit the target VM), plus GNU make. The amd64-windows image had only the mingw cross toolchain and ninja, so the LuaJIT step failed (cmake's 'make' exec'd to nothing, then once make was present, host minilua couldn't find a native cc/libc).
- Image: add gcc-13 + libc6-dev (native host toolchain for minilua/buildvm) and make to the amd64-windows Dockerfile.
- CMake: when CMAKE_SYSTEM_NAME is Windows, drive LuaJIT's documented cross recipe (HOST_CC=gcc-13, CROSS=x86_64-w64-mingw32-, CC=gcc-posix, TARGET_SYS=Windows) instead of the native 'CC=<compiler> -fPIC' form, which would wrongly build the host tools for Windows.
Produces a PE32+ spring.exe with the det-math (lj_sfm_*) symbols linked in.
NOTE: the published amd64-windows image sha in images_versions.sh must be rebuilt/re-pinned to include the new gcc-13/make/libc6-dev packages.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
0b1fafb to df4fad0
The vendored LuaJIT cross-build needs gcc-13 + make + libc6-dev in the builder image (host tools for minilua/buildvm). Those packages were added to the amd64-windows image and pushed to this fork's ghcr namespace, so:
- pin the new amd64-windows image digest in images_versions.sh
- pull the amd64-windows image from ghcr.io/bruno-dasilva in engine-build.yml (linux images still come from beyond-all-reason)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
CI failed with 'No rule to make target clean' in the LuaJIT step: a stale/partial luajit tree left in the build dir by an earlier run had a truncated Makefile (no clean target), and copy_directory MERGES into an existing dir without pruning, so the leftover survived. Remove the build-dir copy before copying fresh, guaranteeing a pristine from-scratch tree, and drop the fragile 'make clean' step (its only job was pruning copy_directory leftovers). No image rebuild needed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
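The "remove, then copy" fix in this commit can be sketched with `std::filesystem`, whose recursive copy likewise adds files to an existing destination without pruning what is already there. This is an analogy in C++17, not the CMake change itself, and `copy_pristine` is a hypothetical helper name.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Copy `src` to `dst` so that `dst` ends up containing only what `src`
// contains. A plain recursive copy into an existing directory would leave
// stale files from an earlier run in place.
inline void copy_pristine(const fs::path& src, const fs::path& dst) {
    assert(fs::is_directory(src));  // precondition: source tree exists
    fs::remove_all(dst);            // succeeds even if dst does not exist yet
    fs::copy(src, dst, fs::copy_options::recursive);
}
```

A quick check is to place a leftover file in the destination, call `copy_pristine`, and confirm the leftover is gone while the source files are present. Removing the destination first makes the result independent of whatever an earlier, possibly interrupted, run left behind.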
The repo-root .gitignore has a bare 'Makefile' rule that ignores every file named Makefile repo-wide, so the vendored rts/lib/luajit/Makefile and rts/lib/luajit/src/Makefile were never committed. Local builds worked off leftover files in the working tree, but a fresh CI checkout had no LuaJIT Makefile -> 'make: No rule to make target libluajit.a'. Force-add both so the cross-build has its Makefile in CI. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
bar-benchmark, PR #29: candidate sim trimmed mean (ms) with 95% CI on the relative delta
No description provided.