
macOS Apple Silicon support via Mesa Zink → KosmicKrisp → Metal #2991

Draft
iamaperson000 wants to merge 28 commits into beyond-all-reason:master from iamaperson000:upstream-pr-macos-apple-silicon


@iamaperson000


Summary

Adds an Apple Silicon macOS build path via Mesa 26.2 (Zink driver) → KosmicKrisp (Vulkan-on-Metal) → Metal. Renders BAR end-to-end on macOS 26 / M-series hardware.

⚠️ SINGLE-PLAYER / SKIRMISH ONLY. Multiplayer is NOT validated and is expected to desync. The deterministic-FP infrastructure (streflop on ARM64/NEON, `-ffp-contract=off`) is structurally in place, but a cross-platform demo-replay test against an x86 BAR install has not been run. Until that's verified, Apple Silicon clients should be treated as incompatible with x86 hosts. Do not merge until MP determinism is empirically validated.

💬 Discussion: this PR is an interim macOS bring-up via Mesa Zink. It is not proposed as the permanent macOS path. Issue #2852 outlines a long-term GL/Vulkan/Metal abstraction interface direction. Every macOS-specific hunk in this PR is gated behind `#ifdef __APPLE__` / `if(APPLE)` / `Platform/Mac/`, so when the abstraction lands, the Mac-specific code can be deleted wholesale without touching cross-platform code paths.

Companion PRs

Cross-platform safety

All behavior changes are gated by one of:

  • `#ifdef __APPLE__` / `#if defined(__APPLE__)`
  • A file under `rts/System/Platform/Mac/` or `include/Mac/`
  • `if(APPLE)` in CMake
  • An env var that defaults to a no-op (the `SPRING_*` and `SPRING_MAC_*` families)

Universal bug fixes (NOT macOS-specific) are called out per commit:

  • `Rendering: guard CTextureAtlas::GetTexID against null atlasTex` — defensive
  • `GlobalRendering: viewport must match the FBO` — latent HiDPI bug
  • `Window: persist logical (point) size, not the backing-pixel size` — HiDPI fix
  • `GL: skip legacy ARB-extension name check on core profile` — per-spec correctness
  • `Rendering: async PBO readback + downsample / timing knobs` — env-gated perf, no-op when env unset
  • `Rendering: env-gated headless frame capture` — diagnostic
  • A few vendored-lib / toolchain compat fixes

Env vars introduced

| Var | Scope | Purpose | Default |
| --- | --- | --- | --- |
| `SPRING_FRAME_CAPTURE` | any | dump backbuffer to raw file | off |
| `SPRING_NO_PBO` | any | disable async PBO readback | off |
| `SPRING_DOWNSAMPLE_READBACK` | any | downsample factor before glReadPixels | 1 |
| `SPRING_TIME_PRESENT` | any | per-60-frame present-stage timing | off |
| `SPRING_MAC_LIBEGL` | CMake | Mesa libEGL.dylib path | discovered |
| `SPRING_MAC_GL_CORE` | macOS | force core profile instead of compat | off |
| `SPRING_MAC_NO_RETINA` | macOS | render at logical (1x) resolution | off |
| `SPRING_MAC_LEGACY_PRESENT` | macOS | glReadPixels fallback present | off |
| `SPRING_MAC_DISABLE_LUAINTRO` | macOS | opt out of BAR Lua loading screen | off |
| `SPRING_MAC_DUMP_FRAME` | macOS | dump CAMetalLayer drawable bytes | off |
| `SPRING_MAC_PRESENT_TEST` | macOS | flash window to sanity-check present | off |
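
As an illustration of the "defaults to no-op" contract, here is a minimal sketch of how such knobs could be read. `EnvFlag` and `EnvInt` are hypothetical helper names, and the parsing rules (any value other than empty or `0` enables a flag; integers below 1 fall back to the default) are assumptions, not the engine's actual behavior.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: a flag is "on" when the variable is set to anything
// other than an empty string or "0". Unset means off, i.e. a no-op.
inline bool EnvFlag(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value != nullptr && value[0] != '\0' && std::strcmp(value, "0") != 0;
}

// Hypothetical helper: integer knob that falls back to `fallback` when the
// variable is unset, not a number, or smaller than 1.
inline int EnvInt(const char* name, int fallback) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    if (value == nullptr) return fallback;
    char* end = nullptr;
    const long parsed = std::strtol(value, &end, 10);
    if (end == value || parsed < 1) return fallback;
    return static_cast<int>(parsed);
}
```

With nothing set, `EnvFlag("SPRING_NO_PBO")` is false and `EnvInt("SPRING_DOWNSAMPLE_READBACK", 1)` returns 1, which matches the defaults in the table.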

Tested

  • macOS 26 / Apple M-series: builds clean, BAR boots, lobby renders, single-player skirmish plays
  • Multiplayer cross-platform determinism: NOT TESTED — expected to desync until validated
  • Linux: relies on this CI matrix
  • Windows: relies on this CI matrix

Known limitations

  • Geometry-shader stripping (`LuaShaders.cpp`) is gated to macOS. Custom Lua shaders that depend on GS will have their geometry stage silently stripped on macOS; the BAR companion PR provides a Lua-layer NoGS fallback for the widgets that need it.
  • Mesa is a runtime dependency on macOS (built from source for KosmicKrisp Vulkan-on-Metal). Distribution / `.app` packaging is a separate later effort.
  • Native MoltenVK / Vulkan-direct path is not provided; that is the abstraction-interface direction discussed in #2852 (Vulkan backend for renderer).
  • Multiplayer expected broken; see top of this description.

Notes for reviewers

  • Suggested review order: build/CMake commits → universal bug fixes → `Platform/Mac/` files → EGL bring-up → Metal present → env-gated perf knobs → LuaShaders GS gate
  • The `__APPLE__` blocks are wide because the EGL init path is genuinely a parallel implementation of the SDL_GL_CreateContext path. Happy to discuss restructuring once the direction in #2852 (Vulkan backend for renderer) lands.

@lostsquirrel1

Collaborator

I see changes in the libs dir. I would prefer that third-party libraries do not receive custom changes on our side: that creates complications if and when we choose to pull later versions. If there are issues with them, then I'd prefer to see them fixed at source unless there's a compelling reason to do otherwise.

I'll have to dive in deeper, but that's something to bear in mind. If the changes in libs can be reduced to the absolute minimum (ideally none), then it will be easier to assess the long-term impact of such changes.

Comment thread rts/System/Platform/Mac/CpuTopology.cpp Outdated
ProcessorMasks GetProcessorMasks() {
ProcessorMasks masks;

unsigned int numCores = std::thread::hardware_concurrency();

Collaborator


Won't this report every core, including the efficiency cores? This will be disastrous for performance.

Could you use something like sysctlbyname("hw.perflevel0.physicalcpu", ...) to count the number of performance cores?

Also, during thread pinning, rather than doing nothing (because you can't pin threads on a Mac), you could mark the threads as QOS_CLASS_USER_INTERACTIVE, while not creating too many performance threads.

Contributor


Given that modern Intel CPUs have efficiency cores, wouldn't it make sense to split this out into its own change rather than it being Arm/Mac specific?

@iamaperson000

Author

Thanks for catching that, and you were right! Should be good now, please let me know if there is anything else. Oh and for the libs dir, I mean I guess we could technically commit upstream but I feel that it would take months, and am not sure if it is worth it.

Happy to do something else though. What are your thoughts?

iamaperson000 and others added 25 commits May 30, 2026 20:24
Adds the foundational macOS/Apple Silicon platform-support:

Platform-specific code (new files):
- rts/System/Platform/Mac/CpuTopology.cpp: stubs the CPU
  topology API on top of std::thread::hardware_concurrency()
  and sysctl for cache sizes (macOS exposes no portable
  per-core P/E topology).
- rts/System/Platform/Mac/ThreadSupport.cpp: native pthread
  wrappers (Suspend/Resume are no-ops on macOS).

Build system (CMake):
- if(APPLE)/NOT APPLE branches in rts/CMakeLists.txt,
  rts/builds/{legacy,dedicated}/CMakeLists.txt,
  rts/lib/glad/CMakeLists.txt, rts/System/CMakeLists.txt,
  test/CMakeLists.txt, and tools/unitsync/CMakeLists.txt so
  the Mac branch picks up the new Platform/Mac sources,
  pulls libunwind only on non-Apple UNIX, and links
  Foundation / objc / EGL (the latter via find_library with
  Homebrew/MESA_PREFIX hints).

Mac-gated source changes:
- Rendering/GlobalRendering.cpp: EGL-on-CAMetalLayer path
  (Kopper/Zink via Mesa) for context creation and
  SwapBuffers, all behind #if defined(__APPLE__).
- Game/LoadScreen.cpp, Rendering/GL/{myGL.cpp,glxHandler.*}:
  skip GLX (Mac has no X server) and skip the Lua intro
  screen (EGL/Metal incompatibility), all behind #ifdef.
- System/Platform/{ThreadAffinityGuard.cpp,.h}: stub the
  affinity API on macOS, which exposes no portable
  equivalent of sched_setaffinity.
- System/MemPoolTypes.h, Sim/Units/Unit.cpp: Apple-only
  fallbacks where pthread_t is an opaque pointer and where
  std::views::enumerate is unavailable in older Apple Clang
  libc++.
- test/other/testMutex.cpp: use os_unfair_lock instead of
  linux/futex.h on macOS.

Mac-driven but platform-neutral fixes:
- System/Platform/Threading.cpp: replace std::ranges::find_if
  with std::find_if (still C++17, compiles everywhere).
- System/SafeUtil.h: add missing <type_traits> include
  needed by libc++.
- lib/smmalloc/smmalloc.h: relax POD static_assert to
  is_trivially_copyable_v (avoids deprecated is_trivial).
- lib/smmalloc/smmalloc_generic.cpp, lib/assimp/include/
  assimp/{matrix3x3,matrix4x4,quaternion,vector2,vector3}.inl:
  add missing <cstdlib>/<cmath> includes (libstdc++
  transitively included them; libc++ does not).
- AI/Wrappers/CUtils/Util.c: const-correct the macOS branch
  of util_fileSelector to match Apple's scandir signature.

Linux and Windows code paths are either unchanged
(#ifdef-guarded) or pick up trivially-compatible standard
library calls; this commit is additive from their
perspective.
- FindSDL2.cmake: Add parent directory to include path so #include
  <SDL2/SDL.h> works with macOS SDL2 config (which sets include to
  /include/SDL2 directly)
- FindLibunwind.cmake: Use INTERFACE IMPORTED library target on macOS
  (fixes -framework treated as file path by Unix Makefiles generator)
- glad/CMakeLists.txt: Exclude glad_glx.c on Apple (no GLX on macOS)
- legacy/headless CMakeLists.txt: Pass -no_warn_duplicate_libraries
  on macOS to silence warnings about benign transitive dependency
  duplicates
- include/AL/: Vendored OpenAL EFX headers (macOS OpenAL.framework
  has no EFX support; needed for sound system compilation)
- rts/CMakeLists.txt: DevIL CMake module sets IL_INCLUDE_DIR to the
  directory containing il.h (e.g. /include/IL), but code uses
  #include <IL/il.h>. Add parent directory to include path.
- gladstub.cpp: Add missing GL function stubs needed for headless
  build (glMultiDrawArraysIndirect, glTexStorage1D, glDebugMessage*,
  GLAD_GL_ATI_meminfo, GLAD_GL_NVX_gpu_memory_info)
- FastMath.h: break circular dependency between math::sqrtf and
  streflop_cond.h's std::hypot workaround on __APPLE__ by providing
  a temporary declaration before the include
- Add Cpp23Compat.hpp: polyfill for std::views::enumerate (Apple
  libc++ doesn't support this C++23 feature yet), following the
  pattern of existing Cpp17Compat.hpp
- headless CMakeLists: re-link engineCommonLibraries after GameHeadless
  to fix macOS single-pass linker symbol resolution
- gflags: set GFLAGS_NAMESPACE to "google;gflags" since subdirectory
  builds default to "gflags" only but engine code uses "google"
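
The `std::views::enumerate` polyfill mentioned in the list above could look roughly like this. It is a minimal C++17 sketch and not the contents of Cpp23Compat.hpp; `EnumerateView`, `Enumerate`, and `IndexWeightedSum` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Wraps a range so that iteration yields (index, element reference) pairs.
template <typename Range>
class EnumerateView {
    using Inner = decltype(std::begin(std::declval<Range&>()));
    using Ref = typename std::iterator_traits<Inner>::reference;

public:
    struct Iterator {
        std::size_t index;
        Inner it;

        bool operator!=(const Iterator& other) const { return it != other.it; }
        void operator++() { ++index; ++it; }
        std::pair<std::size_t, Ref> operator*() const { return {index, *it}; }
    };

    explicit EnumerateView(Range& r) : range(r) {}
    Iterator begin() const { return {0, std::begin(range)}; }
    Iterator end() const { return {0, std::end(range)}; }

private:
    Range& range;  // the view does not own the range; it must outlive the view
};

template <typename Range>
EnumerateView<Range> Enumerate(Range& r) { return EnumerateView<Range>(r); }

// Example use: sum of index * value over a vector.
inline std::size_t IndexWeightedSum(const std::vector<int>& values) {
    std::size_t total = 0;
    for (auto [i, v] : Enumerate(values))
        total += i * static_cast<std::size_t>(v);
    return total;
}
```

Unlike the C++23 view, this sketch only accepts lvalue ranges and supports only range-for iteration, which is probably enough for simple loop call sites.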
smmalloc.h defines `#define INLINE inline` which leaks into
GLTFParser.cpp and conflicts with simdjson's layout_mode::INLINE
enum member, causing compilation failures whenever both headers are
included in the same translation unit (surfaces first on macOS where
the system simdjson header triggers this code path).
- assimp: Resolve ambiguous math function calls (fabsf, fabs, sqrt,
  etc.) that fail with Clang's stricter overload resolution. Use
  explicit std:: qualified calls and static_cast where needed.
- smmalloc: Add missing <type_traits> include and switch to <cstdlib>
  for C++ header consistency.
- float3.h: Clang template instantiation compatibility
- SafeUtil.h: Clang constexpr handling; use
  is_trivially_default_constructible for portability
- MemPoolTypes.h: factor pthread_t/Win32/Linux thread-id logging into
  a helper so libc++ on Apple (where pthread_t is a pointer) formats
  cleanly
- Util.c: drop dead __APPLE__/non-__APPLE__ branch for util_fileSelector
  (both branches had identical const struct dirent* signatures)
- SolLua bind/*.cpp: sol::nil -> sol::lua_nil (sol3 compatibility with
  libc++ where sol::nil is unavailable on some configurations)
The projectile / explosion-FX texture atlas Finalize() can fail
(atlasAllocator Allocate() returns false), leaving atlasTex null.
CProjectileDrawer::Init then calls GetTexID() ->
GL::TextureBase::GetId() on null atlasTex, faulting at offset 0x8.

Guard GetTexID() / DisOwnTexture() to no-op on a null atlasTex so a
failed atlas degrades gracefully instead of crashing. This is a
defensive fix that helps any platform whose atlas allocation can fail.

Also: LuaTextures::Create now logs size/format/glError when
glTexImage fails (was a silent return-nil), aiding diagnosis.
LoadScreen exposes a runtime toggle for CLuaIntro under macOS via the
SPRING_MAC_ENABLE_LUAINTRO env var; the existing
#if defined(__APPLE__) block is preserved and the env-var check is
inside it.
SaveWindowPosAndSize was storing backing pixels (e.g. 2560x1440 on a
1280x720 logical Retina window), so the next launch restored a 2x-too-
big window that was then clamped to the screen, producing a portrait
sliver.

Store the logical size from SDL_GetWindowSize instead of the backing
size from SDL_GL_GetDrawableSize. Affects HiDPI Linux setups
symmetrically.
The engine checks for ARB_multitexture, ARB_texture_env_combine,
ARB_texture_compression, ARB_texture_float,
ARB_texture_non_power_of_two, and ARB_framebuffer_object extensions
by name. Per the GL spec, these were folded into core GL 1.3-3.0;
core-profile contexts no longer advertise them by name, but their
functionality is guaranteed.

The name-only check produces a false negative on any core-profile
context. Skip it when the active context is core profile so the engine
doesn't spuriously reject otherwise-valid configurations.
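
A sketch of the skip logic described above, kept free of GL calls so it can be read in isolation. `MissingLegacyExtensions` is a hypothetical name, not the engine's CheckGLExtensions; in real code the `isCoreProfile` input would presumably come from `glGetIntegerv(GL_CONTEXT_PROFILE_MASK, ...)` tested against `GL_CONTEXT_CORE_PROFILE_BIT`.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the names from `required` that should be reported as missing.
// On a core-profile context the legacy name check is skipped entirely:
// the functionality is part of core GL even though the name is not listed.
inline std::vector<std::string> MissingLegacyExtensions(
    const std::set<std::string>& advertised,
    const std::vector<std::string>& required,
    bool isCoreProfile)
{
    std::vector<std::string> missing;
    if (isCoreProfile)
        return missing;  // nothing to check by name on core profile

    for (const std::string& name : required) {
        if (advertised.count(name) == 0)
            missing.push_back(name);
    }
    return missing;
}
```

On a compatibility context the function still reports names that are absent, so the existing behavior there is preserved.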
Replaces a hardcoded absolute libEGL.dylib path from an earlier
bring-up checkpoint with a SPRING_MAC_LIBEGL CMake cache variable
(default empty -> no Mesa link). Configure with:

  cmake -DSPRING_MAC_LIBEGL=/opt/homebrew/opt/mesa/lib/libEGL.dylib ...

When set, the engine also skips linking Apple's OpenGL.framework and uses
Mesa's libGL.dylib (looked up next to libEGL.dylib) instead. Loading
OpenGL.framework on macOS 26 registers an NSWindow notification observer
that bus-errors in +[NSOpenGLContext currentContext] during window setup.
Each EGL bring-up step now prints the result + last error to stderr.
Quickly identifies whether failure is in eglGetDisplay, eglInitialize,
eglBindAPI, eglChooseConfig, eglCreateContext, or eglMakeCurrent. On
Homebrew's stock Mesa on macOS 26 the failure is at eglInitialize
because that Mesa was built only for the X11 platform.
Engine-level changes to bring up the GL context on Apple Silicon
(macOS 26 / M-series) through a Mesa libEGL built for the
surfaceless platform, driving Zink (OpenGL-on-Vulkan) against the
KosmicKrisp Vulkan driver (Vulkan-on-Metal):

1. CMake: when SPRING_MAC_LIBEGL is set, link Mesa libGL only if a
   libGL.dylib sits next to it. libGL is not strictly required at
   link time — all GL entry points get resolved through
   eglGetProcAddress at runtime, so libEGL alone is enough.

2. EGL: switch eglChooseConfig from EGL_WINDOW_BIT to
   EGL_PBUFFER_BIT. Mesa's surfaceless EGL platform doesn't expose
   window-bit configs; presentation happens via CAMetalLayer +
   KosmicKrisp WSI (Vulkan -> Metal).

3. EGL: walk OpenGL versions 4.6 -> 3.2 calling eglCreateContext
   with CORE profile, take the first that succeeds. Mesa/Zink
   rejects both empty attribs (returns default GL 2.1 which the
   engine then rejects) and (3.0 + CORE) since profile attrs only
   apply to 3.2+.

4. EGL: dump renderer/version/vendor/GLSL strings right after
   eglMakeCurrent to make Zink-on-KosmicKrisp issues visible.

The matching change to skip the legacy ARB extension-name check
in CheckGLExtensions on a CORE-profile context was landed in an
earlier commit on this branch.

Required runtime env:
  EGL_PLATFORM=surfaceless
  MESA_LOADER_DRIVER_OVERRIDE=zink
  GALLIUM_DRIVER=zink
  MESA_GL_VERSION_OVERRIDE=4.6
  MESA_GLSL_VERSION_OVERRIDE=460
  VK_ICD_FILENAMES=<mesa-install>/share/vulkan/icd.d/kosmickrisp_mesa_icd.aarch64.json
  DYLD_LIBRARY_PATH=<mesa-install>/lib

Result: GL 4.6 (Core Profile) Mesa 26.2.0-devel, renderer 'zink
Vulkan 1.3(Apple M4 (MESA_KOSMICKRISP))', GLSL 4.60. GL4 mode
enabled. Clean engine startup + graceful shutdown.
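
The version walk from step 3 can be sketched as below. `PickHighestContext` and the `tryCreate` callback are hypothetical; the callback stands in for an `eglCreateContext` call with `EGL_CONTEXT_MAJOR_VERSION`, `EGL_CONTEXT_MINOR_VERSION`, and a core-profile mask, and the exact candidate list between 4.6 and 3.2 is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

using GLVersion = std::pair<int, int>;  // (major, minor)

// Tries each candidate from newest to oldest and returns the first version
// for which tryCreate(major, minor) reports success, or nullopt if none do.
inline std::optional<GLVersion> PickHighestContext(
    const std::function<bool(int, int)>& tryCreate)
{
    assert(tryCreate);  // an empty callback is a caller error

    static constexpr GLVersion kCandidates[] = {
        {4, 6}, {4, 5}, {4, 4}, {4, 3}, {4, 2}, {4, 1}, {4, 0}, {3, 3}, {3, 2},
    };
    for (const GLVersion& v : kCandidates) {
        if (tryCreate(v.first, v.second))
            return v;
    }
    return std::nullopt;
}
```

Stopping at 3.2 mirrors the note that profile attributes only apply to 3.2 and later.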
The surfaceless Mesa EGL can't make a window surface, so the
engine renders into an off-screen pbuffer that eglSwapBuffers
never presents -> white window. Add a manual present: read the
rendered framebuffer back (glReadPixels, BGRA8) and blit it onto
the window's CAMetalLayer drawable via Metal, then present.

- rts/System/Platform/Mac/MetalPresent.{h,mm}: MRC-safe Metal
  helper. MacMetalPresent_Init(layer) sets up MTLDevice/queue and
  configures the CAMetalLayer (BGRA8, framebufferOnly=NO).
  MacMetalPresent_PresentBGRA() uploads a CPU BGRA buffer to a
  staging texture, blits it into nextDrawable, presents, commits.
  Optional vertical flip for GL bottom-up readback.
- System/CMakeLists.txt: build MetalPresent.mm in the Mac
  platform sources.
- builds/legacy/CMakeLists.txt: link Metal + QuartzCore
  frameworks on Apple.
- GlobalRendering.cpp: stash the CAMetalLayer (g_metalLayer);
  SPRING_MAC_PRESENT_TEST now drives the flash through this path.
  Confirmed: the window shows the rendered clear color (red)
  instead of staying white -- the present mechanism works end to
  end (GL/Zink -> KosmicKrisp -> glReadPixels -> Metal -> window).

Next: wire MacMetalPresent_PresentBGRA into
CGlobalRendering::SwapBuffers (with flipY) so real frames
present, and successively fix the load-time crashes
(CProjectileDrawer atlas, etc.) to reach the draw loop. The
glReadPixels roundtrip is a stopgap; IOSurface GL/Metal interop
is the perf follow-up.
- SwapBuffers: on the macOS/EGL path, read the rendered default
  framebuffer back (glReadPixels BGRA, flipY) and blit it to the
  CAMetalLayer via MacMetalPresent each frame, then
  SDL_PumpEvents() so CoreAnimation composites (we replaced
  SDL_GL_SwapWindow which used to service the run loop).
- InitEGLContext: size the pbuffer to the window's *backing*
  pixels (SDL_GetWindowSize points * backingScaleFactor) instead
  of logical points, so full Retina-resolution rendering isn't
  clipped. Init MacMetalPresent here.
- MetalPresent.mm: log nil drawables.
- Debug: SPRING_MAC_DUMP_FRAME=<prefix> dumps rendered frames to
  raw files (header w,h + BGRA) for offline inspection without
  screen capture.

Confirmed the present path is correct: a dumped load-time frame
is solid black because an earlier prototype disables the Lua
loading-screen renderer on macOS (CLoadScreen::Draw only draws
when luaIntro != nullptr; it's skipped here), so nothing is drawn
during load. Real imagery requires reaching CGame's draw loop
past the CProjectileDrawer atlas crash.
Replace the SwapBuffers path's full-frame CPU pixel copy + Y-flip
memcpy + MTLTexture replaceRegion upload with an IOSurface-backed
MTLTexture. The engine writes glReadPixels output directly into
the IOSurface's CPU view (honoring its rowBytes via
GL_PACK_ROW_LENGTH), and a one-triangle Metal render pass samples
the same surface and Y-flips into the drawable.

Adds:
- MacMetalPresent_AcquireIOSurfaceBuffer(w, h, &rowBytes) returns
  a locked CPU pointer also bound as an MTLTexture; recreates the
  backing only when dimensions change.
- MacMetalPresent_PresentIOSurface(flipY) unlocks the surface,
  encodes a cached render pipeline state (flip / non-flip), and
  presents the drawable.
- IOSurface.framework added to the legacy build's link list.

The original MacMetalPresent_PresentBGRA is kept for the
early-splash callsite, and as a runtime fallback selectable via
SPRING_MAC_LEGACY_PRESENT=1.

Notes:
- Apple Silicon's IOSurface picks 64-pixel row alignment, so at
  width 2940 rowBytes is 11776 (= 2944 pixels per row, 4 pixels
  of padding). Honor it via GL_PACK_ROW_LENGTH or the readback
  tears.
- Logs '[MetalPresent] IOSurface zero-copy path active (WxH,
  rowBytes=N)' once on first acquire, and a corresponding
  'LEGACY CPU-staging path active' line if the fallback is
  taken.
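
The row-padding arithmetic in the note above, worked as code. This is only a sketch: the 64-pixel alignment is taken from the note and treated as an assumption, the function names are hypothetical, and real code would read the stride from the surface (for example via `IOSurfaceGetBytesPerRow`) rather than compute it.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kBytesPerPixel = 4;    // BGRA8
constexpr std::size_t kRowAlignPixels = 64;  // assumption, from the note above

// Width in pixels after rounding up to the row alignment.
inline std::size_t PaddedRowPixels(std::size_t width) {
    assert(width > 0);
    return (width + kRowAlignPixels - 1) / kRowAlignPixels * kRowAlignPixels;
}

// Stride in bytes for a BGRA8 row of the given width.
inline std::size_t PaddedRowBytes(std::size_t width) {
    return PaddedRowPixels(width) * kBytesPerPixel;
}

// Value to pass as GL_PACK_ROW_LENGTH (which is expressed in pixels, not
// bytes) so glReadPixels writes rows at the surface's stride.
inline std::size_t PackRowLengthFor(std::size_t rowBytes) {
    assert(rowBytes % kBytesPerPixel == 0);
    return rowBytes / kBytesPerPixel;
}
```

For width 2940 this gives 2944 pixels per row and 11776 bytes, matching the figures in the note.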
ReadWindowPosAndSize bound winSize / viewSize to GetMetalDrawableSize()
on the macOS surfaceless path, but the engine renders into a backing-
resolution pbuffer FBO. They were equal by accident at full Retina
(drawable == backing == pbuffer size); with non-1x render scales they
diverge -> glViewport(0,0,drawableW,drawableH) on a smaller FBO meant
only one quadrant of geometry landed.

Bind to the FBO size instead. This is a latent bug on any platform:
HiDPI Linux setups that ever render into a smaller FBO than the
drawable would hit the same issue. No behavior change in the default
same-size case.
Mesa 26.2 Zink grants a 4.6 compatibility-profile context on Apple
Silicon via KosmicKrisp (verified). The EGL init now prefers
compatibility (version walk 4.6 -> 3.2) and falls back to core only if
compat is refused. Set SPRING_MAC_GL_CORE=1 to force core.

Compatibility profile is a strict superset of core: every modern GL4
feature is available AND legacy paths (immediate mode, display lists,
the fixed-function matrix stack, '#version ... compatibility' GLSL)
keep working. Several Lua-built shaders in BAR rely on those legacy
paths, so the compat profile is the easier integration point on the
macOS path.

Geometry shaders remain unavailable regardless of profile because
Vulkan reports geometryShader=false on Metal; that is a separate
problem and not affected by this change.
Adds a glReadPixels-based frame dump hook in SwapBuffers, gated by the
SPRING_FRAME_CAPTURE env var. When set, the engine writes the default
FBO contents to <prefix>.<frame>.raw before each present (use
raw2png.py to convert).

Useful for verifying headless rendering output without needing a
window system. The hook is a no-op when the env var is unset, so the
default behavior on every platform is unchanged.
Three env-gated features useful on any GL backend that uses
glReadPixels for present:

- SPRING_NO_PBO=1: disable double-buffered async PBO readback
  (default is ON; PBO hides the glReadPixels GPU pipeline drain
  behind 1 frame of present latency)
- SPRING_DOWNSAMPLE_READBACK=N: blit-downsample by N before readback
- SPRING_TIME_PRESENT=1: per-60-frame stage timing breakdown

Also adds SPRING_MAC_NO_RETINA=1 to render the pbuffer at logical
(1x) resolution instead of full backing (Retina) size — Apple-Silicon
specific perf knob, no effect on other platforms.

PBO async readback default-on gave ~3x FPS in busy scenes on the
macOS Zink+KosmicKrisp path (sync 41 ms busy -> PBO 13 ms steady).
Behavior is unchanged on platforms that don't engage the readback
present path.
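
To show where the one frame of present latency comes from, here is a GL-free simulation of double-buffered readback scheduling. It is a sketch under assumptions, not the engine's code: `PingPongReadback` is a hypothetical name, and frame ids stand in for pixel data. In a real implementation each slot would be a buffer bound to `GL_PIXEL_PACK_BUFFER`, with `glReadPixels` writing into one slot while the other is mapped.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

class PingPongReadback {
public:
    // Call once per frame. Queues a readback for `frameId` and returns the
    // id of the frame whose pixels are available now, if any.
    std::optional<int> Submit(int frameId) {
        assert(frameId >= 0);
        const std::size_t writeSlot = count % 2;
        const std::size_t readSlot = (count + 1) % 2;

        // The slot written during the previous call is the one that is safe
        // to consume now; on the very first call it is still empty.
        const std::optional<int> ready = slots[readSlot];

        // Queue this frame's readback into the other slot.
        slots[writeSlot] = frameId;
        ++count;
        return ready;
    }

private:
    std::array<std::optional<int>, 2> slots{};
    std::size_t count = 0;
};
```

The first call returns nothing, and every later call returns the previous frame, so the consumer always runs exactly one frame behind the renderer.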
CLuaIntro was previously disabled on macOS as a workaround for the
core-profile shader path. With the compat-profile context now the
default (see earlier commit on this branch), the loading-screen text /
splash / progress works correctly via the engine font renderer.

Flip the macOS default to ENABLED, and switch the env-var escape hatch
to opt-OUT: set SPRING_MAC_DISABLE_LUAINTRO=1 to skip CLuaIntro and
fall back to the simple black load screen. Non-macOS platforms are
unchanged (CLuaIntro has always been on by default there).
Metal (via Mesa Zink / KosmicKrisp on Apple Silicon) has no geometry-
shader stage; Vulkan reports geometryShader = false on that path. But
Mesa advertises GL_MAX_GEOMETRY_OUTPUT_VERTICES > 0 regardless, so the
engine cannot detect the missing capability via GL introspection.

Strip GS from Lua-loaded shader programs on macOS so the program at
least links. Non-macOS platforms continue to honor the shader author's
intent -- Linux / Windows GL drivers support geometry shaders, and any
custom Lua shader using GS would have been silently broken on those
platforms by the previous unconditional strip.

Widgets that need GS-style point expansion have a Lua-layer NoGS
fallback in the BAR widget tree (separate PR), so the engine-level
strip is a fail-safe rather than the primary mechanism.
SDL emits mouse events in logical (point) coordinates. On the macOS
surfaceless+pbuffer path the engine viewport is in backing-pixel
coordinates (winSize / viewSize are tied to the pbuffer FBO; see
GlobalRendering::ReadWindowPosAndSize). Without rescaling, windowed-
mode clicks land at half the cursor position on Retina displays.

Add two static helpers (ScaleMouseCoords, ScaleMouseDelta) gated by
#if defined(__APPLE__), and route MOUSEMOTION / MOUSEBUTTONDOWN /
MOUSEBUTTONUP through them. Non-macOS platforms are untouched - the
#else branch matches the prior behavior exactly.
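
The rescaling can be illustrated with a small sketch. `ScalePointToBacking` and `Int2` are hypothetical names and this is not the body of ScaleMouseCoords; it assumes the logical window size and the backing (pbuffer) size are both known at the call site.

```cpp
#include <cassert>

struct Int2 {
    int x;
    int y;
};

// Converts a point from logical (point) coordinates, as delivered by SDL
// mouse events, into backing-pixel coordinates used by the viewport.
inline Int2 ScalePointToBacking(Int2 p, Int2 windowSize, Int2 backingSize) {
    assert(windowSize.x > 0 && windowSize.y > 0);
    return {
        p.x * backingSize.x / windowSize.x,
        p.y * backingSize.y / windowSize.y,
    };
}
```

On a 1280x720 logical window with a 2560x1440 backing store, a click at (100, 50) maps to (200, 100); when both sizes match, the point is returned unchanged.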
LuaParser.cpp:127 and LuaHandleSynced.cpp:435 call
LuaLibs::OpenSynced, which is only defined in Lua/LuaLibs.cpp.
A previous cherry-pick dropped that file from both the
dedicated server and unitsync source lists, leaving the bare
declaration in LuaLibs.h to satisfy compilation while breaking
the link on every platform.

Re-add ${ENGINE_SRC_ROOT_DIR}/Lua/LuaLibs.cpp to both targets
so OpenSynced is actually linked in.
The glGetIntegerv(GL_MAX_GEOMETRY_OUTPUT_VERTICES) probe and the
first L_WARNING log sat OUTSIDE the macOS-only #if, so any
Linux/Windows Lua shader carrying a geometry stage would emit a
spurious warning every compile and pay for an unnecessary GL
query.

Move both inside the __APPLE__ block alongside the existing
"GS unconditionally stripped on macOS" log and geomSrcs.clear()
call. Non-Apple builds now ignore non-empty geomSrcs as before.
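
The gating described above might be structured as follows. This is a sketch with hypothetical names (`ShaderSources`, `StripGeometryIfUnsupported`, `kStripGeometryStage`), not the code in LuaShaders.cpp; the `strip` parameter exists only so the behavior can be exercised on any platform.

```cpp
#include <string>
#include <vector>

#if defined(__APPLE__)
constexpr bool kStripGeometryStage = true;   // no geometry stage on this path
#else
constexpr bool kStripGeometryStage = false;  // honor the shader as written
#endif

struct ShaderSources {
    std::vector<std::string> vert;
    std::vector<std::string> geom;
    std::vector<std::string> frag;
};

// Drops the geometry sources when stripping is enabled. Returns true if
// anything was removed, so the caller can log a warning exactly then.
inline bool StripGeometryIfUnsupported(ShaderSources& sources,
                                       bool strip = kStripGeometryStage) {
    if (!strip || sources.geom.empty())
        return false;
    sources.geom.clear();
    return true;
}
```

Returning whether anything was stripped keeps the warning tied to the macOS branch, so builds that do not strip never log or query anything.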
- FindSDL2.cmake: the SDL2::SDL2 INTERFACE_INCLUDE_DIRECTORIES
  rewrite was added for Homebrew SDL2 (which sets the include to
  .../include/SDL2 only). On Linux distros sdl2-config already
  produces a usable include layout, and rewriting it would leak
  /usr/include into every SDL2-consuming target. Gate the elseif
  branch on APPLE.
- builds/legacy/CMakeLists.txt: replace a U+2014 em-dash in a
  comment with ASCII -- so the source stays plain-ASCII.
- lib/CMakeLists.txt: keep GFLAGS_NAMESPACE="google;gflags" but
  fix the rationale comment. Engine code uses gflags::, however
  Homebrew's /opt/homebrew/include/gflags/gflags.h is picked up
  first (DevIL etc. add -I/opt/homebrew/include) and that header
  hard-codes GFLAGS_NAMESPACE=google, so the DEFINE_* macros in
  main.cpp emit google::FlagRegisterer references. Building the
  vendored gflags with both namespaces resolves either spelling.
macOS's OpenAL.framework lacks the EFX extension headers, so we
vendor efx.h and alext.h. Previously these lived in include/AL/,
which is added to every target's include path via
include_directories(\${CMAKE_SOURCE_DIR}/include/AL). On Linux
that would shadow the system OpenAL-Soft devel headers exposed
through the OpenAL::OpenAL CMake target.

Move them to include/Mac/AL/ and add that path only inside the
Sound CMakeLists' if(APPLE) branch. Linux builds keep using the
system AL headers; macOS still finds <efx.h> and <alext.h>.
Addresses beyond-all-reason#2991 review feedback (thanks @lostsquirrel1).

Apple Silicon cores are heterogeneous: a small high-performance (P)
cluster and a larger efficiency (E) cluster. Treating every visible core
as a P-core (the previous behavior) caused the engine to over-provision
sim worker threads, some of which then landed on E-cores at ~1/3 the
throughput of P-cores.

Two changes:

1. CpuTopology::GetProcessorMasks now reads the per-perflevel sysctl
   keys (hw.perflevel0.physicalcpu, hw.perflevel1.physicalcpu) to count
   P-cores and E-cores separately, and reports them in the appropriate
   masks. Intel Macs and older kernels do not expose perflevel keys,
   so behavior on those targets is unchanged (all cores treated as P).

   On an Apple M4 (4 P + 6 E) the new masks read:
     Performance Core Mask:      0x0000000f
     Efficiency  Core Mask:      0x000003f0
   and Optimal thread count drops from 9 to 4, matching the P-cluster.

2. ThreadSupport::SetupCurrentThreadControls now calls
   pthread_set_qos_class_self_np(QOS_CLASS_USER_INTERACTIVE, 0) so the
   kernel preferentially schedules these threads on the P-cluster. The
   call is gated to threads that pass through ThreadStart with a
   ThreadControls handle (the sim workers), not every pthread in the
   process, so background I/O / helper threads remain free to land on
   the E-cluster.
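
A sketch of how the masks in change 1 could be derived from the two sysctl counts. `MakeCoreMasks`, `ReadSysctlCount`, and `CoreMasks` are hypothetical names, and placing P-cores in the low bits is an assumption inferred from the M4 masks quoted above. The sysctl read is compiled only on Apple platforms and returns 0 elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

struct CoreMasks {
    std::uint32_t performance;
    std::uint32_t efficiency;
};

// Builds contiguous bit masks: P-cores in the low bits, E-cores above them.
inline CoreMasks MakeCoreMasks(unsigned pCores, unsigned eCores) {
    assert(pCores + eCores <= 32);
    const std::uint64_t p = (std::uint64_t{1} << pCores) - 1;
    const std::uint64_t e = ((std::uint64_t{1} << eCores) - 1) << pCores;
    return {static_cast<std::uint32_t>(p), static_cast<std::uint32_t>(e)};
}

// Reads an integer sysctl key such as "hw.perflevel0.physicalcpu".
// Returns 0 when the key is missing or on non-Apple platforms.
inline unsigned ReadSysctlCount(const char* key) {
#if defined(__APPLE__)
    int value = 0;
    std::size_t size = sizeof(value);
    if (sysctlbyname(key, &value, &size, nullptr, 0) == 0 && value > 0)
        return static_cast<unsigned>(value);
#else
    (void)key;
#endif
    return 0;
}
```

`MakeCoreMasks(4, 6)` yields 0x0000000f and 0x000003f0, the same values reported for the M4. A caller would fall back to treating all cores as P-cores when `ReadSysctlCount` returns 0 for the perflevel keys.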