Skip to content

agoodkind/lmd

Repository files navigation

lmd

A single-binary LM Studio replacement for Apple Silicon.

lmd owns every part of the local-LLM workstation experience:

  • broker on localhost:5400 exposes an OpenAI-compatible HTTP API over any MLX model on disk
  • JIT model routing spawns a dedicated SwiftLM child per model, allocates ports from a pool, shuts them down under memory pressure
  • sensor sampling to memory.jsonl for thermal, battery, and power time-series
  • fan control is disabled in lmd-serve during the current moratorium; macOS owns fans while the broker runs
  • multi-tab TUI (monitor, library, bench, events) rendered in raw terminal mode
  • benchmark orchestrator for long-running model comparison jobs

One subsystem for unified logs (io.goodkind.lmd). One daemon (lmd-serve). One interactive tool (lmd-tui).

Install

make install

This:

  1. Builds release binaries via SwiftPM (swift build -c release) and the MLX Metal shader library (default.metallib) via Tuist + xcodebuild. Both halves are required; see Tools/lmd-dev.swift for the rationale.
  2. Copies the binaries and mlx-swift_Cmlx.bundle to ~/.local/bin/ (override with PREFIX=/opt/...).
  3. Writes ~/Library/LaunchAgents/io.goodkind.lmd.serve.plist from the template with your install path substituted in.
  4. launchctl bootstraps the agent into the current GUI session.

The broker starts running immediately and at every subsequent login. Requires Xcode (for xcodebuild + tuist) and a SwiftPM toolchain matching Package.swift's swift-tools-version.

Binaries

Binary Role Lifecycle
lmd Dispatcher. lmd serve, lmd tui, lmd bench, lmd qa execs the right sibling. Short-lived (the user runs it).
lmd-serve Broker + sensor sampler. Fan control is disabled during the current moratorium. 24/7 LaunchAgent.
lmd-tui Interactive dashboard (monitor / library / bench / events tabs). Foreground while the user wants it open.
lmd-bench Benchmark orchestrator. Long runs that survive terminal close. Foreground or detached via nohup.
lmd-qa TUI QA harness for CI (three drivers: tmux, pty, iTerm). CI only.

The broker on 5400 speaks the OpenAI API. Point Cursor, humanify, or anything else at http://localhost:5400 and it just works.

Video

POST /v1/chat/completions may include OpenAI-style video_url content parts for models whose catalog capabilities advertise video: true.

lmd stays dumb here on purpose. It validates that video_url.url points to a local file and routes that file through to the MLX VLM backend. lmd does not decode, retime, or expand the video itself. Temporal sampling is backend-owned.

With the current Swift Qwen video processors in upstream mlx-swift-lm, that backend-owned policy is 2 FPS, so video support is honest routing support, not high-fidelity subtle-animation analysis.

Environment

Defaults live in deploy/io.goodkind.lmd.serve.plist.example. All lmd-serve environment variables:

Var Default Meaning
LMD_HOST localhost Broker bind host.
LMD_PORT 5400 Broker bind port.
LMD_BUDGET_GB 80 Max GB of models resident at once. Evictions happen above this.
LMD_IDLE_MINUTES 15 After this many minutes idle, unload a chat (SwiftLM) model.
LMD_EMBEDDING_IDLE_MINUTES 60 Idle timeout for in-process embedding backends (often longer than chat).
LMD_SAMPLE_INTERVAL 15 Seconds between sensor samples.
LMD_DATA_DIR ~/Library/Application Support/io.goodkind.lmd Where memory.jsonl lands.
LMD_SWIFTLM_BINARY ~/Sites/SwiftLM/.build/arm64-apple-macosx/release/SwiftLM SwiftLM inference engine to spawn.

The client-side dispatcher also reads LMD_HOST and LMD_PORT for lmd status, lmd load, lmd unload, etc.

Embeddings

POST /v1/embeddings accepts an OpenAI-shaped body (model, input as a string or array of strings, optional encoding_format, must not set stream).

Models are classified as chat or embedding when the catalog scans disk: sentence_bert_config.json or modules.json, config.json architectures (BERT family, Snowflake Arctic Embed, and similar), model_type hints, plus name patterns such as embed or bge. See ModelCatalog.inferModelKind in SwiftLMRuntime.

GET /v1/models and GET /swiftlmd/loaded include a kind field per entry (chat or embedding). Chat requests against an embedding id return HTTP 400.

Embedding inference uses backend families in process (SwiftLMEmbed, weights from the same directories as chat models). MLX-compatible embedder metadata routes to MLXEmbedders. NVIDIA Mistral bidirectional SentenceTransformers metadata, including models such as nvidia/NV-EmbedCode-7b-v1, routes to the native NVIDIA embedding backend.

Smoke test from the dispatcher: lmd embed -h then lmd embed -m <id> -t "hello".

Observability

Everything structured flows through os.Logger under subsystem io.goodkind.lmd:

# Live tail.
log stream --subsystem io.goodkind.lmd --info

# Last hour with category filter.
log show --predicate 'subsystem == "io.goodkind.lmd" AND category == "Broker"' --last 1h

# NDJSON for parsing.
log show --subsystem io.goodkind.lmd --last 30m --style ndjson

Data artifacts (memory.jsonl, bench results/*.json) live under LMD_DATA_DIR and are separate from logs. The Apple-native logging policy itself is enforced by make log-audit and codified in AGENTS.md §5.

Develop

SwiftPM pulls macos-smc-fan from https://github.com/agoodkind/macos-smc-fan.git on branch main. A normal clone of this repo plus tuist on PATH (brew install tuist) is enough for make build.

make build              # hybrid SwiftPM (binaries) + xcodebuild (metallib)
make debug              # SwiftPM debug build only (no metallib refresh)
make test               # unit + snapshot + integration tests
make tui-qa             # interactive TUI QA: tmux + pty + iTerm drivers
make log-audit          # enforce the Apple-native logging policy
make run-tui            # launch the TUI in foreground
make run-serve          # run the broker in foreground (bypasses launchd)
make restart-serve      # pick up a new broker binary under launchd
make uninstall          # remove binaries + LaunchAgent

Every Make target is a thin alias over Tools/lmd-dev.swift. To skip Make and call it directly: swift Tools/lmd-dev.swift help.

Layout

lmd/
  Package.swift          SwiftPM package (executables + library targets)
  Project.swift          Tuist Xcode project (used only to compile default.metallib)
  Tuist.swift            Tuist configuration shim
  Tuist/                 Tuist's own SwiftPM resolution for project generation
  Tools/
    lmd-dev.swift        Swift-script driver behind every Make target
  Sources/
    AppLogger/           shared os.Logger + swift-log bridge
    SwiftLMCore/         model descriptors, shared types
    SwiftLMBackend/      SwiftLM child-process lifecycle + MLX VLM video backend
    SwiftLMEmbed/        embedding backend families (MLXEmbedders + native NVIDIA)
    SwiftLMRuntime/      router, bench config + orchestrator, fan policy library, event bus
    SwiftLMMonitor/      macmon client, sensor sampler, battery reader
    SwiftLMControl/      XPC broker client + protocol
    SwiftLMTUI/          tab protocol, panels, ANSI + input parsers
    LMDServeSupport/     HTTP routing helpers shared between lmd-serve and tests
    lmd/                 dispatcher (lmd <subcommand>)
    lmd-serve/           broker + sampler daemon
    lmd-tui/             interactive dashboard
    lmd-bench/           benchmark runner
    lmd-qa/              three-driver TUI QA harness
  Tests/
    SwiftLMTUITests/     tab render snapshots
    SwiftLMRuntimeTests/ bench, router, fan logic, model catalog capabilities
    SwiftLMBackendTests/ SwiftLM server config, MLX VLM video backend
    SwiftLMCoreTests/    model capabilities
    SwiftLMControlTests/ broker protocol
    LMDServeTests/       video chat routing
    IntegrationTests/    binary launch + SIGINT, embeddings route
    Fixtures/            shared inputs (log categories, tuiqa coverage)
  deploy/
    io.goodkind.lmd.serve.plist.example   LaunchAgent template
    homebrew/                              brew formula
  plan/
    VIDEO_ROUTING_FINAL_DECISION.md        boundary for video request routing

Related projects

  • SwiftLM upstream MLX inference engine; lmd-serve spawns one child per loaded model.
  • macos-smc-fan Swift package linked by the fan policy library. lmd-serve does not currently take over fans.
  • fancurveagent the LaunchAgent that owns fans independently of lmd-serve during the current moratorium.

About

macOS XPC-based LM Studio broker and companion CLIs

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages