You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Support BOTH a containerised cluster worker and a native/bare-metal worker, not one or the other. Containerised is the recommended default; native is an opt-in for dedicated nodes. Relates to #890 (worker auto-update).
Why containerised as the default (esp. GPU workers)
First-hand evidence while standing up the RTX 3060 SD backend: native CUDA on bleeding-edge Fedora 43 (gcc 15 / glibc 2.42) would not build (CUDA 12.9 nvcc rejects gcc>14; gcc-14 then hit a glibc mathcalls.h incompatibility). The container path (nvidia-container-toolkit + docker run --gpus all) bundled a compatible toolchain and saw the GPU immediately.
Makes "install CUDA if missing" tractable: ensure driver + container toolkit, run image, instead of maintaining a per-distro driver/CUDA/kernel-module matrix.
Universal management: one update path, reproducible; multiple backends (sd.cpp, ComfyUI, ollama) coexist as separate containers without polluting the host.
Why keep a native/bare-metal option
Dedicated homelab/enthusiast nodes want max performance, minimal footprint, no container runtime.
Locked-down or older hosts where Docker/podman is unavailable or GPU passthrough is finicky.
Simpler mental model for a single-purpose box.
Design: one worker, two deployment adapters
Define the worker as a transport-agnostic capability contract (register with controller, advertise backends/capabilities, drain protocol, apply_update(), health-check). The same worker-agent logic runs whether the backend process is a native binary or a container. "Container vs bare-metal" is a deployment adapter under a single apply_update() abstraction (image pull vs package/binary update). Avoids two codebases.
Installer behaviour
Detect the host and recommend:
Has, or can install, a container runtime -> containerised (default). For GPU: ensure driver + nvidia-container-toolkit, pull the backend image.
User picks "dedicate this machine / minimal install" -> native path; installer best-effort sets up driver/CUDA per-distro, with the container path as the fallback when native GPU setup is too fragile.
Cross-platform per the worker scripts policy (bash + powershell; Linux/macOS/Windows).
Acceptance
A new GPU worker can be brought up either way from the installer, both register and serve the same capability contract, and #890's update/drain/rollback works identically across both adapters. Brainstorm the contract + adapter boundary before building; do not interrupt the current storybook/image-gen line.
Decision (Jay 2026-06-14)
Support BOTH a containerised cluster worker and a native/bare-metal worker, not one or the other. Containerised is the recommended default; native is an opt-in for dedicated nodes. Relates to #890 (worker auto-update).
Why containerised as the default (esp. GPU workers)
First-hand evidence while standing up the RTX 3060 SD backend: native CUDA on bleeding-edge Fedora 43 (gcc 15 / glibc 2.42) would not build (CUDA 12.9 nvcc rejects gcc>14; gcc-14 then hit a glibc mathcalls.h incompatibility). The container path (
nvidia-container-toolkit+docker run --gpus all) bundled a compatible toolchain and saw the GPU immediately.Why keep a native/bare-metal option
Design: one worker, two deployment adapters
Define the worker as a transport-agnostic capability contract (register with controller, advertise backends/capabilities, drain protocol,
apply_update(), health-check). The same worker-agent logic runs whether the backend process is a native binary or a container. "Container vs bare-metal" is a deployment adapter under a singleapply_update()abstraction (image pull vs package/binary update). Avoids two codebases.Installer behaviour
Detect the host and recommend:
Acceptance
A new GPU worker can be brought up either way from the installer, both register and serve the same capability contract, and #890's update/drain/rollback works identically across both adapters. Brainstorm the contract + adapter boundary before building; do not interrupt the current storybook/image-gen line.