
feat(vm): add openshell-vm microVM gateway backend (opt-in via NEMOCLAW_GATEWAY_BACKEND=vm) #1791

Draft
ericksoa wants to merge 47 commits into main from feat/openshell-vm-backend

Conversation

ericksoa (Contributor) commented on Apr 11, 2026:

Summary

Adds openshell-vm (a libkrun microVM) as an opt-in alternative gateway backend. Docker remains the default. Users enable the VM backend by setting NEMOCLAW_GATEWAY_BACKEND=vm; there are no new CLI flags and no changes to documented behavior.

Also bumps the OpenShell version pin from 0.0.26 to 0.0.32, picking up seccomp hardening, Landlock fixes, SSRF protection, deny rules in the policy schema, and standalone binary publishing.

All VM backend code is exercised nightly by the vm-e2e CI job (30/30 passing on ubuntu-latest with KVM). The Docker path is unchanged and all existing E2E jobs continue to pass.

How it works

detectGatewayBackend() checks, in order:

  1. NEMOCLAW_GATEWAY_BACKEND env var — "vm" or "docker" overrides everything
  2. Docker available → use Docker (default)
  3. openshell-vm binary in PATH → use VM
  4. Neither → "unknown" (onboard fails with guidance)
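
A minimal sketch of this priority order as a pure function (names and signature are illustrative; the real detectGatewayBackend() in src/lib/platform.ts probes the live system rather than taking flags):

```typescript
// Illustrative sketch of the backend selection order described above.
type Backend = "docker" | "vm" | "unknown";

function selectBackend(
  envOverride: string | undefined, // NEMOCLAW_GATEWAY_BACKEND
  dockerAvailable: boolean,
  vmBinaryInPath: boolean
): Backend {
  // 1. Explicit env override wins over everything
  if (envOverride === "vm" || envOverride === "docker") return envOverride;
  // 2. Docker is the default when available
  if (dockerAvailable) return "docker";
  // 3. Fall back to the VM backend if openshell-vm is installed
  if (vmBinaryInPath) return "vm";
  // 4. Neither: onboarding fails with guidance
  return "unknown";
}

console.log(selectBackend("vm", true, false)); // override beats Docker: "vm"
console.log(selectBackend(undefined, true, true)); // "docker"
```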

When VM is selected, onboard.ts spawns openshell-vm --name nemoclaw --mem 4096 as a detached process, tracks its PID, polls gRPC health, and manages the full lifecycle (start, health check, resume, cleanup). The sandbox image is built with docker build on the host, exported via docker save, written into the VM rootfs via virtio-fs, and imported into the VM's containerd via ctr images import.
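
The health-poll step can be sketched as a bounded retry loop (a sketch with illustrative names; the real implementation polls the gateway's gRPC endpoint and, per a later commit, uses 60 attempts at 3 s intervals):

```typescript
// Retry an injected async health check up to `attempts` times, pausing
// `delayMs` between tries. Returns true as soon as the check passes.
async function pollHealthy(
  check: () => Promise<boolean>,
  attempts: number,
  delayMs: number
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await check()) return true;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}

// Example with a fake check that succeeds on the third attempt.
let calls = 0;
pollHealthy(async () => ++calls >= 3, 5, 10).then((ok) =>
  console.log(ok, calls) // logs: true 3
);
```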

Benefits of the VM backend

| | Docker (default) | VM (NEMOCLAW_GATEWAY_BACKEND=vm) |
| --- | --- | --- |
| Isolation | Sandboxes share the host kernel via Docker containers | Sandboxes run inside a hardware-isolated microVM (KVM); no shared kernel, no Docker socket |
| Dependencies | Requires Docker daemon (or Podman with Docker compat) | Requires only /dev/kvm and the openshell-vm binary; no Docker daemon for the gateway |
| Memory | ~8 GiB for Docker daemon sidecar + k3s | ~4 GiB for the microVM (configurable via --mem) |
| Startup | ~60 s (Docker daemon + k3s bootstrap) | ~6 s VM boot + ~60 s k3s bootstrap inside the VM |
| K8s deployment | Requires Docker-in-Docker (DinD sidecar, 3 volumes, init container) | Single container with KVM access, no sidecar |
| GPU inference | Works via host inference server + L7 proxy | Same: GPU inference is routed through inference.local (OpenShell L7 proxy → host), so the gateway never touches the GPU directly |
| macOS | Docker Desktop or Colima | Hypervisor.framework (no Docker needed); not yet tested with this integration |

Tradeoffs and limitations

  • KVM required: The VM backend needs /dev/kvm, which means bare-metal or nested virtualization. Not available in all cloud VMs or container runtimes.
  • Host Docker still needed for sandbox image build: docker build + docker save run on the host to create the sandbox image. This is standard CI usage, not Docker-in-Docker, but it means Docker isn't fully eliminated.
  • Two rootfs workarounds (both self-disabling): see "Workarounds" section below.
  • macOS untested: openshell-vm supports macOS ARM64 via Hypervisor.framework but this NemoClaw integration hasn't been tested on Mac yet.
  • Not the default: Docker is the established, documented, tested-in-production path. The VM backend is new and should bake before becoming a default.

OpenShell version bump: 0.0.26 → 0.0.32

This PR bumps max_openshell_version and the install pin from 0.0.26 to 0.0.32. Key improvements in the new range:

| Version | What it brings |
| --- | --- |
| v0.0.28 | VM kernel source fix for CONFIG_POSIX_MQUEUE (d8cf7951); runtime not yet rebuilt, shim still needed |
| v0.0.29 | Seccomp hardening, Landlock path fixes, symlink resolution |
| v0.0.30 | ComputeDriver refactor, deny rules in network policy schema, streaming fix |
| v0.0.31 | Exclude vm-dev tag from version glob, header allowlist for inference |
| v0.0.32 | Standalone openshell-sandbox binaries published, system CA cert support |

Workarounds (both self-disabling)

1. mqueue runc shim (still needed)

The vm-dev kernel runtime artifacts have not been rebuilt since CONFIG_POSIX_MQUEUE=y was added to the source (d8cf7951, 2026-04-10). Without mqueue support, every container inside the VM fails with error mounting "mqueue" to rootfs: no such device.

Fix: Write a containerd config that routes runc through a shim script. The shim edits each container's config.json to use tmpfs instead of mqueue.

Self-disables: The init script tests mount -t mqueue at boot. If the kernel supports it, the shim is never installed. Once the vm-dev runtime is rebuilt with the kernel fix, this becomes a no-op.
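
For illustration, the shim's core edit can be modeled as a pure transform over the OCI spec's mounts array (hypothetical helper in TypeScript; the actual shim is a shell script that edits config.json with sed, swapping mqueue for tmpfs as described above):

```typescript
// Illustrative transform: rewrite mqueue mount entries in an OCI runtime
// spec to tmpfs, so runc never asks the kernel for the missing filesystem.
interface OciMount {
  destination: string;
  type: string;
  source?: string;
  options?: string[];
}

function mqueueToTmpfs(mounts: OciMount[]): OciMount[] {
  return mounts.map((m) =>
    m.type === "mqueue"
      ? { ...m, type: "tmpfs", source: "tmpfs", options: ["nosuid", "nodev", "noexec"] }
      : m
  );
}

const mounts: OciMount[] = [
  { destination: "/proc", type: "proc", source: "proc" },
  { destination: "/dev/mqueue", type: "mqueue", source: "mqueue" },
];
console.log(mqueueToTmpfs(mounts)[1].type); // "tmpfs"
```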

2. Supervisor glibc extraction (still needed)

The openshell-sandbox supervisor binary is built against glibc 2.39 (Ubuntu 24.04) but sandbox containers use Ubuntu 22.04 (glibc 2.35). It crashes with GLIBC_2.38 not found.

Fix: Extract the compatible binary from the Docker gateway image at onboard time (~7s, once).

Self-disables: The extraction only runs when Docker is available; once upstream fixes the build target, it simply overwrites the binary with an equivalent one.

What changed (17 files, ~+1500/-40)

| File | Purpose |
| --- | --- |
| src/lib/platform.ts | detectGatewayBackend(): Docker preferred, VM via env override |
| src/lib/onboard.ts | VM gateway lifecycle: spawn, PID tracking, health poll, rootfs patches (mqueue shim + glibc), image import, resume/recovery, DNS proxy skip for VM |
| src/lib/openshell.ts | isOpenshellVmAvailable(), getInstalledOpenshellVmVersion() |
| src/lib/onboard-session.ts | gatewayBackend field in session (persists choice across resume) |
| src/nemoclaw.ts | VM-aware cleanup (stopVmGateway), pruneKnownHostsEntries import |
| nemoclaw-blueprint/blueprint.yaml | gateway_backends: [docker, vm], max version bump to 0.0.32 |
| schemas/blueprint.schema.json | Schema update for gateway_backends |
| scripts/install-openshell.sh | Version pin bump to 0.0.32 |
| scripts/brev-launchable-ci-cpu.sh | Default OpenShell version bump to v0.0.32 |
| .github/workflows/nightly-e2e.yaml | vm-e2e job (KVM, NEMOCLAW_GATEWAY_BACKEND=vm, diagnostics) |
| test/e2e/test-vm-backend-e2e.sh | Full VM E2E: install → onboard → inference → resume → reset (30 checks) |
| test/openshell-vm.test.ts | Unit tests for VM detection, session persistence, lifecycle |
| test/platform.test.ts | Unit tests for detectGatewayBackend() priority and overrides |

E2E test phases (30/30 passing)

| Phase | Description |
| --- | --- |
| 0 | Prerequisites (Linux, KVM, API key, network) |
| 1 | Install openshell-vm binary + runtime |
| 2 | Install NemoClaw with NEMOCLAW_GATEWAY_BACKEND=vm |
| 3 | Post-install verification (registry, list, status, no Docker container) |
| 4 | Live inference through VM sandbox (PONG test) |
| 5 | Resume after openshell-vm kill |
| 6 | Reset: destroy and clean-slate re-onboard |
| 7 | Final cleanup |

Test plan

  • npm test — all tests pass
  • Pre-commit and pre-push hooks pass
  • vm-e2e: 30/30 on GitHub Actions ubuntu-latest (KVM)
  • cloud-e2e: passes (Docker path not regressed)
  • sandbox-survival-e2e: passes
  • hermes-e2e: passes
  • messaging-providers-e2e: passes
  • skip-permissions-e2e: passes
  • security-configuration-hardening test: passes (Docker manifest unchanged)
  • Manual testing on Mac with openshell-vm

Refs: NVIDIA/OpenShell#611

OpenShell v0.0.26 ships openshell-vm, a standalone binary that boots a
hardware-isolated microVM via libkrun. This commit lays the groundwork
for NemoClaw to support both the existing Docker/k3s gateway and the new
microVM gateway as selectable backends.

Phase 1 changes:
- Bump min_openshell_version from 0.0.24 to 0.0.26 across blueprint,
  install script, onboard preflight, CI scripts, E2E tests, and docs
- Add gateway_backends field to blueprint.yaml schema (docker, vm)
- Add isOpenshellVmAvailable() and getInstalledOpenshellVmVersion() to
  openshell.ts for detecting the openshell-vm binary
- Add detectGatewayBackend() to platform.ts with NEMOCLAW_GATEWAY_BACKEND
  env var override, auto-detection preferring VM when available and
  falling back to Docker, and mandatory Docker for GPU workloads
- Add gatewayBackend field to onboard session schema for persisting the
  selected backend across resume cycles
- Add tests for all new functions

The VM backend requires no Docker daemon and provides faster boot, but
has no NVIDIA GPU passthrough (libkrun lacks PCI/VFIO support), so the
Docker backend remains mandatory for local inference on GPU workstations.

Refs: NVIDIA/OpenShell#611
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>

coderabbitai Bot commented Apr 11, 2026

Important: review skipped because this PR is a draft. To trigger a single review, invoke the @coderabbitai review command.

🚥 Pre-merge checks: 2 passed, 1 warning

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 27.91%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding openshell-vm as a microVM gateway backend with environment-variable opt-in, which aligns with the comprehensive set of changes across configuration, code, tests, and workflows. |
| Description check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |



github-actions Bot commented:

🚀 Docs preview ready!

https://NVIDIA.github.io/NemoClaw/pr-preview/pr-1791/

ericksoa added 27 commits April 11, 2026 14:15
Phase 4 of the dual-backend feature (PR #1791):

- New test/e2e/test-vm-backend-e2e.sh: full E2E journey for the VM
  backend — install openshell-vm from release assets, onboard with
  NEMOCLAW_GATEWAY_BACKEND=vm, verify sandbox creation, live inference
  through the microVM gateway, resume after openshell-vm kill, and
  reset to clean slate.

- New vm-e2e job in nightly-e2e.yaml: runs on ubuntu-latest (has
  /dev/kvm), installs openshell-vm, executes the VM backend E2E test.

- New vm-backend test suite in brev-e2e.test.ts: allows running the
  VM backend E2E on ephemeral Brev instances via TEST_SUITE=vm-backend.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The openshell-vm binary is published on the vm-dev pre-release tag
(not v0.0.26) with gnu libc (not musl). Also downloads the VM runtime
(kernel + rootfs) needed by libkrun, and installs zstd for
decompression.

Asset corrections:
- Tag: vm-dev (not v0.0.26)
- Binary: openshell-vm-*-unknown-linux-gnu.tar.gz (not musl)
- Checksums: vm-binary-checksums-sha256.txt
- Runtime: vm-runtime-linux-*.tar.zst → ~/.local/share/openshell-vm/

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
When detectGatewayBackend() returns "vm", the onboard and runtime
recovery flows now use openshell-vm instead of Docker:

- startGatewayWithOptions() detects the backend and delegates to
  startVmGatewayProcess() for the VM path, which spawns openshell-vm
  as a detached background process with PID tracking
- VM lifecycle helpers: writeVmPidFile, readVmPid, isVmProcessAlive,
  stopVmGateway (SIGTERM→SIGKILL), isVmGatewayHealthy
- destroyGateway() checks session.gatewayBackend and stops the VM
  process instead of Docker volume cleanup when backend is "vm"
- recoverGatewayRuntime() reads session.gatewayBackend to choose
  VM vs Docker recovery path
- recoverNamedGatewayRuntime() in nemoclaw.ts skips Docker-specific
  gateway select commands for VM backend
- cleanupGatewayAfterLastSandbox() stops VM process instead of
  Docker cleanup when backend is "vm"
- Gateway backend is saved to onboard session on step completion
  so resume cycles know which backend to use
- Resume flow checks VM health via isVmGatewayHealthy() instead of
  Docker gateway state when session records VM backend

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
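
The SIGTERM-then-SIGKILL stop sequence mentioned above can be sketched with injected callbacks (illustrative only; the real stopVmGateway operates on the PID tracked in the session):

```typescript
// Send SIGTERM, wait up to `graceMs` for the process to exit, then escalate
// to SIGKILL. kill/isAlive are injected so the control flow is visible.
async function stopVm(
  kill: (sig: "SIGTERM" | "SIGKILL") => void,
  isAlive: () => boolean,
  graceMs: number
): Promise<"terminated" | "killed"> {
  kill("SIGTERM");
  const deadline = Date.now() + graceMs;
  while (Date.now() < deadline) {
    if (!isAlive()) return "terminated"; // exited gracefully
    await new Promise((resolve) => setTimeout(resolve, 10));
  }
  kill("SIGKILL"); // grace period exhausted
  return "killed";
}
```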
Reading the openshell-vm source (NVIDIA/OpenShell crates/openshell-vm)
revealed three bugs in the Phase 2 implementation:

1. openshell-vm was spawned with no arguments. It needs `--name nemoclaw`
   so it extracts the rootfs to the correct instance directory and
   registers gateway metadata under the right identity.

2. openshell-vm prefixes instance names: gateway_name("nemoclaw")
   produces "openshell-vm-nemoclaw". All OPENSHELL_GATEWAY env vars
   and `openshell gateway select` calls for the VM path now use
   VM_GATEWAY_NAME ("openshell-vm-nemoclaw") instead of GATEWAY_NAME.

3. Health poll was too short (15 × 2s = 30s). The VM boots k3s inside
   a microVM with its own 90s internal health check. Increased to
   60 × 3s = 180s to avoid racing the inner bootstrap. Also log the
   last 10 lines from openshell-vm.log on failure for diagnostics.

Additionally: the VM gateway listens on port 30051 (NodePort), not
8080. The openshell-vm binary handles the port mapping internally
(gvproxy host:30051 → VM:30051 → kube-proxy → pod:8080).

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
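
A minimal mirror of the prefixing rule from item 2, handy for keeping OPENSHELL_GATEWAY env vars and gateway select calls consistent (the function name here is illustrative; the behavior follows gateway_name in the openshell-vm source):

```typescript
// openshell-vm registers instances under an "openshell-vm-" prefix, so the
// gateway named "nemoclaw" must be addressed as "openshell-vm-nemoclaw".
function vmGatewayName(name: string): string {
  return `openshell-vm-${name}`;
}

console.log(vmGatewayName("nemoclaw")); // "openshell-vm-nemoclaw"
```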
The GitHub-hosted ubuntu runner has /dev/kvm (crw-rw---- root:kvm) but
the runner user is not in the kvm group. openshell-vm opens /dev/kvm
directly via libkrun and fails with EACCES.

Fix by chmod 666 /dev/kvm in the KVM verification step. Also add the
user to the kvm group for completeness, though the chmod is sufficient
for the current process.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…tics

Three changes to address gvproxy crash on GitHub Actions:

1. Pass --mem 4096 to openshell-vm. The default 8GB is half the 16GB
   runner memory. With k3s pulling container images inside the VM, host
   processes (gvproxy, gvisor netstack) get starved. 4GB is enough for
   a lightweight gateway without GPU workloads.

2. Detect and use the E2E-downloaded VM runtime via OPENSHELL_VM_RUNTIME_DIR.
   The test script downloads gvproxy/libkrun/libkrunfw from the vm-dev
   release to ~/.local/share/openshell-vm/ but never tells openshell-vm
   to use it. The downloaded runtime may contain fixes not in the binary's
   embedded copy. When the downloaded runtime exists (has gvproxy), set
   OPENSHELL_VM_RUNTIME_DIR to prefer it.

3. Add VM diagnostics step to CI: openshell-vm log, gvproxy log, dmesg
   OOM check, memory stats, and VM console log. This will show the
   actual root cause if gvproxy crashes again.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The openshell-vm kernel has CONFIG_POSIX_MQUEUE=y but the init script
(openshell-vm-init.sh) never mounts the mqueue filesystem. When k3s
creates a pod, runc tries to mount mqueue at /dev/mqueue inside the
container namespace and gets ENODEV ("no such device") because the
host mount point doesn't exist.

Fix: run `openshell-vm prepare-rootfs` to extract the rootfs, then
patch the init script to mkdir + mount mqueue alongside the existing
devpts/shm mounts. The patch is idempotent — skipped if the init
script already contains /dev/mqueue.

Root cause found by tracing the VM console log:
  runc create failed: error mounting "mqueue" to rootfs at
  "/dev/mqueue": no such device

This should be fixed upstream in the OpenShell init script.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The previous mqueue patch ran silently when prepare-rootfs failed or
the init script wasn't found. Add verbose logging at each step so CI
output shows exactly what happened: rootfs path, init script location,
whether the string replacement matched, and any errors.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The vm-dev kernel (built 2026-04-09) predates the CONFIG_POSIX_MQUEUE
addition to the kconfig (2026-04-10 d8cf7951). runc fails with ENODEV
when trying to mount -t mqueue inside container namespaces.

Previous approach (mounting mqueue in init script) didn't work because
the kernel itself lacks the filesystem type — mount returns ENODEV.

New approach: install a runc wrapper shim that strips mqueue mount
entries from the OCI config.json before calling the real runc binary.
The shim is only installed when the kernel actually lacks mqueue
support (tested by attempting mount -t mqueue). When a future kernel
rebuild includes CONFIG_POSIX_MQUEUE, the shim is not installed.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…pt breakage

The previous approach embedded a heredoc in the init script patch,
which broke the init script (VM died before exec agent started).

New approach: write runc-wrapper.sh as a separate file in
/opt/nemoclaw/ of the rootfs, then patch the init script with a
simple test-and-swap block. The wrapper is a standalone script that
strips mqueue entries from config.json via sed before calling the
real runc binary. The init script patch is minimal — just an if/fi
block that copies the wrapper over runc when the kernel lacks mqueue.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…shim

Previous approach of wrapping /usr/bin/runc didn't work because k3s
bundles its own runc in /var/lib/rancher/k3s/data/<hash>/bin/, extracted
fresh at each startup. Our host-side wrapper was never called.

New approach: write /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
in the init script, which k3s reads before generating its containerd config.
The template sets BinaryName to /opt/nemoclaw/runc-shim, which strips
mqueue mount entries from config.json before calling the k3s-bundled runc.

This routes ALL runc invocations through the shim, correctly handling the
k3s data directory extraction timing.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
When the session's gatewayBackend is "vm", the standard `openshell
sandbox create --from Dockerfile` path fails because the openshell CLI
internally uses Docker put_archive to push the image into the gateway
container — which doesn't exist for VM gateways.

Fix: detect VM backend during sandbox creation and:
1. Build the image locally with Docker (same Dockerfile)
2. Export with docker save to the VM's rootfs via virtio-fs
   (host can write directly to the rootfs/tmp/ directory)
3. Import into VM containerd via openshell-vm exec + ctr images import
4. Pass --from <image-ref> instead of --from <Dockerfile> so the
   openshell CLI treats it as a pre-existing image (skips Docker push)

Falls back to the standard Dockerfile path if any step fails, so Docker
backend behavior is unchanged.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
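
The build, save, import, and fallback steps above can be sketched as control flow over injected step implementations (illustrative names; the real code shells out to docker build/save, writes through virtio-fs, and runs ctr images import inside the VM):

```typescript
// Illustrative control flow for VM sandbox-image creation with fallback.
interface ImportSteps {
  build: () => Promise<boolean>;      // docker build on the host
  save: () => Promise<boolean>;       // docker save into the VM rootfs (virtio-fs)
  importToVm: () => Promise<boolean>; // ctr images import inside the VM
}

async function createSandboxImage(
  steps: ImportSteps
): Promise<"image-ref" | "dockerfile-fallback"> {
  try {
    if ((await steps.build()) && (await steps.save()) && (await steps.importToVm())) {
      return "image-ref"; // pass --from <image-ref>; openshell skips the Docker push
    }
  } catch {
    // any step failure falls through to the standard Dockerfile path
  }
  return "dockerfile-fallback";
}
```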
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…erlay)

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…g issues

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Two post-sandbox fixes for VM backend:

1. DNS proxy: setup-dns-proxy.sh uses docker exec to reach kubectl
   inside the gateway container. VM backend has no Docker container.
   Skip the DNS proxy — gvproxy provides NAT networking with working
   DNS through the gateway IP (192.168.127.1).

2. Sandbox readiness: the sandbox pod may briefly flip Ready→NotReady
   during init container restarts in the VM. Add a 30s wait loop in
   setupOpenclaw before attempting sandbox connect, preventing the
   FailedPrecondition "sandbox is not ready" error.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
The vm-dev openshell-sandbox binary was built with cargo-zigbuild on
a host with glibc 2.39, but sandbox containers use Ubuntu 22.04
(glibc 2.35). The binary crashes at startup:
  GLIBC_2.38 not found (required by /opt/openshell/bin/openshell-sandbox)

Fix: extract the openshell-sandbox binary from the Docker gateway
image (ghcr.io/nvidia/openshell/cluster:<version>), which was built
with rust:1.88-slim (Debian bookworm, glibc 2.36 — compatible with
Ubuntu 22.04's glibc 2.35). Replace the VM rootfs copy before boot.

The supervisor is side-loaded into sandbox pods via hostPath volume
from /opt/openshell/bin on the k3s node, so this fix propagates to
all sandbox pods automatically.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
waitForSandboxReady() used `openshell doctor exec -- kubectl` which
runs docker exec inside the gateway container. This is Docker-specific
and fails silently for VM gateways (no Docker container exists).

For VM backend, use `openshell sandbox list` + isSandboxReady() which
goes through the gRPC API and works for both Docker and VM gateways.
Docker backend behavior is unchanged.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
…ateway

When openshell-vm is killed after a successful onboard, the session is
marked complete (resumable=false). Running `nemoclaw onboard --resume`
correctly reports "No resumable session found" — but the E2E test
expects resume to restart the gateway and reconnect the sandbox.

Fix: when --resume finds a completed session with gatewayBackend=vm
and the VM gateway is dead, restart openshell-vm and check if the
sandbox is still alive. If the sandbox reconnects after gateway
restart, exit 0 (recovery successful). If not, mark the session
resumable and fall through to the normal re-onboard path.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
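
The resume decision described above, reduced to a pure function (a sketch; names and the exact conditions are illustrative):

```typescript
// Given the session and gateway state after `nemoclaw onboard --resume`,
// decide whether recovery succeeded or a re-onboard is needed.
type ResumeAction = "recovered" | "re-onboard" | "no-session";

function resumeAction(
  hasCompletedVmSession: boolean,
  vmGatewayAlive: boolean,
  sandboxReconnectsAfterRestart: boolean
): ResumeAction {
  if (!hasCompletedVmSession) return "no-session";
  if (vmGatewayAlive) return "recovered"; // nothing to do
  // Gateway dead: restart openshell-vm, then check whether the sandbox
  // reconnects; if not, mark the session resumable and re-onboard.
  return sandboxReconnectsAfterRestart ? "recovered" : "re-onboard";
}
```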
…ng resume

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
GPU inference is routed through inference.local (OpenShell L7 proxy →
host inference server). The sandbox and gateway never need direct GPU
access — the GPU is used by the host-side Ollama/vLLM/NIM process.
The VM backend works for all scenarios including local GPU inference.

Remove the gpuRequested parameter and the conditional that forced
Docker when GPU was detected. VM is now preferred whenever openshell-vm
is available, regardless of GPU presence.

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
On Apr 12, 2026, ericksoa changed the title from "feat(vm): dual-backend foundation for openshell-vm microVM gateway" to "feat(vm): openshell-vm microVM gateway backend (full lifecycle, E2E 30/30)".
Replace the two-container DinD architecture (dockerd sidecar + workspace)
with a single container using openshell-vm. Removes:

- docker:24-dind sidecar container (8Gi RAM, 2 CPU)
- Docker socket shared volume
- Docker storage volume
- Docker config init container (cgroup v2 workaround)
- docker.io apt package install
- Docker daemon wait loop

The workspace container now installs openshell-vm via the upstream
install script and sets NEMOCLAW_GATEWAY_BACKEND=vm. Requires /dev/kvm
on the k8s node (bare-metal or nested virt enabled).

Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
ericksoa (Contributor, Author) commented:

What dropping Docker-in-Docker means in practice

The k8s manifest (k8s/nemoclaw-k8s.yaml) went from a two-container DinD architecture to a single container using openshell-vm. Here's what changes:

Security

  • No more privileged Docker daemon. The DinD sidecar ran dockerd with privileged: true, which gives the container full root access to the host kernel — it can load kernel modules, access all devices, and escape the container. The VM backend still needs privileged: true for /dev/kvm, but KVM is a much narrower capability than a full Docker daemon. The attack surface shrinks from "arbitrary container execution engine" to "hardware virtualization API."
  • No Docker socket sharing. The old manifest shared /var/run/docker.sock between containers via an emptyDir volume. Any process with access to the Docker socket can run arbitrary containers on the host. That shared volume is gone.
  • Stronger isolation. Sandboxes run inside a hardware-isolated microVM (KVM + libkrun), not inside Docker containers sharing the host kernel. A sandbox escape would need to break out of the VM's hardware boundary, not just a Linux namespace.

Resource usage

  • ~8Gi less memory requested. The DinD sidecar requested 8Gi RAM + 2 CPU just for the Docker daemon. The VM backend uses 4Gi for the microVM (configurable via --mem) and doesn't need a separate daemon process. Total pod request drops from 12Gi to 8Gi.
  • One fewer container. No Docker daemon to start, health-check, or restart. Fewer moving parts means fewer failure modes.
  • No Docker image layer storage. The DinD sidecar used an emptyDir for /var/lib/docker which could grow unbounded as images were pulled and built. The VM uses a fixed-size state disk (32 GiB sparse file, allocated on demand).

Startup time

  • No Docker daemon wait. The old flow spent up to 60 seconds waiting for dockerd to become ready (docker info poll loop). The VM boots in ~6 seconds to first health check.
  • No Docker image pulls inside Docker. DinD needed to pull the OpenShell cluster image inside the Docker daemon, which was slow (no layer cache on first run). The VM has the k3s cluster embedded in its rootfs.

Operational simplicity

  • 122 lines → 56 lines. The manifest is less than half the size.
  • No init container. The cgroup v2 Docker daemon config hack (init-docker-config) is gone.
  • No shared volumes. Three emptyDir volumes (docker-storage, docker-socket, docker-config) are gone.
  • Easier debugging. One container, one process. kubectl logs nemoclaw gives you everything.

Requirements

  • KVM on the node. The k8s node must have /dev/kvm available. Bare-metal nodes and VMs with nested virtualization enabled have this. Standard cloud VMs on GKE/EKS may not — check your node pool configuration.
  • Docker still needed on the host for building the sandbox image (docker build + docker save). This is a build-time dependency, not a runtime one. Future work could eliminate this by building inside the VM's containerd or using podman.

coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/brev-launchable-ci-cpu.sh (1)

31-31: ⚠️ Potential issue | 🟡 Minor

Stale comment: default version mismatch.

Line 31 documents the default as v0.0.20, but line 43 sets the actual default to v0.0.26. Update the comment to match:

-#   OPENSHELL_VERSION     — OpenShell CLI release tag (default: v0.0.20)
+#   OPENSHELL_VERSION     — OpenShell CLI release tag (default: v0.0.26)

Also applies to: 43-43

🧹 Nitpick comments (6)
test/e2e/test-vm-backend-e2e.sh (3)

34-34: Consider adding set -e for fail-fast behavior.

The script uses set -uo pipefail but omits -e (errexit). This means commands can fail silently without stopping the test. While the explicit pass/fail tracking handles some cases, intermediate commands (like cd, tar, install) could fail and the script would continue, potentially producing misleading results.

If intentional (to allow the test framework to track each failure), consider documenting why -e is omitted.


291-295: apt-get usage assumes Debian/Ubuntu.

While the script targets GitHub ubuntu-latest runners, users running locally on other Linux distributions (Fedora, Arch) would see failures. Consider adding a fallback or clearer error message.

Suggested defensive check
 # zstd may not be installed — install it if needed
 if ! command -v zstd >/dev/null 2>&1; then
   info "Installing zstd for runtime decompression..."
-  sudo apt-get update -qq && sudo apt-get install -y -qq zstd >/dev/null 2>&1
+  if command -v apt-get >/dev/null 2>&1; then
+    sudo apt-get update -qq && sudo apt-get install -y -qq zstd >/dev/null 2>&1
+  elif command -v dnf >/dev/null 2>&1; then
+    sudo dnf install -y -q zstd >/dev/null 2>&1
+  else
+    fail "zstd not found and no supported package manager available"
+    exit 1
+  fi
 fi

488-495: The pkill -f "openshell-vm" pattern may be overly broad.

This pattern matches any process with "openshell-vm" anywhere in its command line, which could unintentionally kill unrelated processes in shared environments. Consider using the PID file that onboard.ts writes (~/.nemoclaw/openshell-vm.pid) for targeted termination.

Suggested alternative using PID file
 info "Killing openshell-vm process to simulate crash..."
-# Find and kill any openshell-vm gateway process
-if pkill -f "openshell-vm" 2>/dev/null; then
+# Kill via PID file for targeted termination
+VM_PID_FILE="$HOME/.nemoclaw/openshell-vm.pid"
+if [ -f "$VM_PID_FILE" ] && kill -0 "$(cat "$VM_PID_FILE")" 2>/dev/null; then
+  kill "$(cat "$VM_PID_FILE")" 2>/dev/null
   pass "openshell-vm process killed"
   sleep 3
 else
   info "No openshell-vm process found to kill (may already be stopped)"
 fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-vm-backend-e2e.sh` around lines 488 - 495, Replace the broad
pkill usage with a PID-file based shutdown: read the PID from
~/.nemoclaw/openshell-vm.pid if it exists, verify the PID corresponds to a
running process whose command line contains "openshell-vm" (to avoid killing
unrelated PIDs), send a targeted kill to that PID and remove the pidfile; if the
pidfile is missing or verification fails, fall back to the existing pkill -f
"openshell-vm" behavior as a last resort and log appropriate info messages. This
change should be applied where pkill -f "openshell-vm" is used in the
test/e2e/test-vm-backend-e2e.sh script.
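A sketch of the PID-file-first shutdown the prompt describes, with command-line verification before killing and `pkill` only as a last resort (helper name and exact log strings are illustrative):

```shell
# Hypothetical targeted shutdown for the VM gateway process.
kill_vm_gateway() {
  local pid_file="$HOME/.nemoclaw/openshell-vm.pid"
  if [ -f "$pid_file" ]; then
    local pid
    pid="$(cat "$pid_file")"
    # Only kill if the PID is alive AND its command line mentions openshell-vm,
    # so a recycled PID belonging to an unrelated process is never targeted.
    if kill -0 "$pid" 2>/dev/null && ps -o args= -p "$pid" 2>/dev/null | grep -q "openshell-vm"; then
      kill "$pid" && rm -f "$pid_file"
      echo "openshell-vm process killed via PID file"
      return 0
    fi
  fi
  # Last resort: original broad-match behavior.
  pkill -f "openshell-vm" 2>/dev/null \
    && echo "openshell-vm killed via pkill fallback" \
    || echo "No openshell-vm process found to kill (may already be stopped)"
}
```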
src/lib/onboard.ts (3)

2141-2235: Complex rootfs patching logic is well-documented but could benefit from extraction.

The mqueue fix (lines 2141-2235) is a significant workaround with detailed comments explaining the kernel limitation. While the documentation is excellent, this ~95-line block could be extracted into a separate helper function (e.g., patchVmRootfsForMqueue()) to improve readability and testability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 2141 - 2235, Extract the mqueue workaround
block into a new helper function named patchVmRootfsForMqueue(rootfsDir, vmEnv)
and call it where the current block lives; move all logic that creates
shimDir/shimPath, writes the runc-shim, patches the init script (referencing
initScript, shimDir, shimPath, and the K3S_ARGS replacement) and the related fs
operations and logging into that function so the main flow around
spawnSync(prepResult) stays concise; ensure the helper receives rootfsDir (from
prepOutput) and returns success/failure or throws so existing console logs
(e.g., "Patched VM init..." / "Init script not found...") remain consistent and
tests can target patchVmRootfsForMqueue independently.

2815-2849: Import script approach is clever but embeds shell code in TypeScript.

The import script (lines 2817-2833) written to the rootfs is a reasonable workaround for the exec agent limitations. However, the shell script embedded as a JavaScript array is hard to maintain and test.

Consider extracting this to a separate .sh file that gets copied to the rootfs, similar to how other scripts are handled in the scripts/ directory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 2815 - 2849, The embedded shell script
written to importScript makes maintenance hard; move the script text into a
standalone file (e.g., scripts/import-image.sh) and have onboard.ts copy that
file into the VM rootfs instead of constructing the array inline. Update the
code around importScript, fs.writeFileSync, and fs.chmodSync to locate the new
script (use path.join(__dirname, "..", "scripts", "import-image.sh") or
equivalent), copy it into rootfs (fs.copyFileSync or streaming copy), then set
executable bits and keep the existing run(...) and runCapture(...) calls that
reference /opt/nemoclaw/import-image.sh; ensure any template values (if needed)
are either left static or replaced before copying and adjust tests accordingly.

2245-2265: Two functions named getInstalledOpenshellVersion exist with different signatures—consider renaming for clarity.

Line 2246 correctly calls the imported version from openshell.ts with signature (binary: string, opts: CaptureOpenshellOptions). However, a local version at line 393 exists with signature (versionOutput = null) that serves a different purpose (parsing pre-captured output). While the code uses the correct function, having two identically-named functions with different signatures and purposes invites confusion during maintenance.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/onboard.ts` around lines 2245 - 2265, There are two different
functions named getInstalledOpenshellVersion: one imported from "./openshell"
(signature (binary: string, opts: CaptureOpenshellOptions)) and a local helper
that parses pre-captured output (signature (versionOutput = null)); rename the
local parser to something unambiguous like parseOpenshellVersionOutput (or
parseCapturedOpenshellVersion), update all internal call sites to that new name,
and leave the imported getInstalledOpenshellVersion unchanged so callers like
the gatewayImage construction continue to use the correct function.
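The renamed local parser could look like this sketch (the version-string format is an assumption):

```typescript
// Hypothetical rename of the local helper: parses a semver out of
// pre-captured `openshell --version` output, returning null when absent.
export function parseOpenshellVersionOutput(versionOutput: string | null = null): string | null {
  if (!versionOutput) return null;
  const match = versionOutput.match(/(\d+\.\d+\.\d+)/);
  return match ? match[1] : null;
}
```

The imported `getInstalledOpenshellVersion` from `./openshell` keeps its name and signature; only internal call sites of the local parser change.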
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/lib/platform.ts`:
- Around line 123-135: The detectGatewayBackend function currently defaults
opts.vmAvailable to false which causes dual-capable hosts to prefer "docker"
silently; update detectGatewayBackend to either (A) perform VM detection when
opts.vmAvailable is undefined (mirror docker detection by calling an appropriate
VM detection helper, e.g., detectVMHost or similar, and set vmAvailable based on
its result) or (B) make opts.vmAvailable required and throw a clear error if
it's not provided; ensure you update the vmAvailable initialization (and any
callers) instead of hard-coding false and keep the dockerAvailable logic that
calls detectDockerHost unchanged.

In `@src/nemoclaw.ts`:
- Around line 523-543: The VM branch constructs vmGatewayName but calls
getNamedGatewayLifecycleState() with no name; update both lifecycle checks to
call getNamedGatewayLifecycleState(vmGatewayName) so the code inspects the VM
gateway (before and after), keep runOpenshell(["gateway","select",
vmGatewayName]) and setting process.env.OPENSHELL_GATEWAY as-is, and leave
startGatewayForRecovery() behavior unchanged unless it also needs a gateway name
in future.

In `@test/openshell-vm.test.ts`:
- Around line 129-164: The VM tests are reading the real PID file and process
table causing flakiness; update the helpers or tests so they don't touch real
system state: modify readVmPid to accept an optional pidFilePath (or add an
injectable function) and make isVmGatewayHealthy accept a liveness-check
function (or inject isVmProcessAlive), then in the tests call these helpers with
a temp/test PID file and a mocked liveness probe, or alternatively mock fs
(e.g., fs.readFileSync/existsSync) and process checks used by
isVmProcessAlive/isVmGatewayHealthy; adjust tests to create controlled PID
contents and to stub process liveness so assertions are deterministic.

---

Outside diff comments:
In `@scripts/brev-launchable-ci-cpu.sh`:
- Line 31: Update the stale comment that documents the default OpenShell CLI
release tag so it matches the actual default; change the commented line
referencing "OPENSHELL_VERSION — OpenShell CLI release tag (default: v0.0.20)"
to reflect the current default v0.0.26 used where OPENSHELL_VERSION is set,
ensuring the comment and the OPENSHELL_VERSION default value are consistent.

---

Nitpick comments:
In `@src/lib/onboard.ts`:
- Around line 2141-2235: Extract the mqueue workaround block into a new helper
function named patchVmRootfsForMqueue(rootfsDir, vmEnv) and call it where the
current block lives; move all logic that creates shimDir/shimPath, writes the
runc-shim, patches the init script (referencing initScript, shimDir, shimPath,
and the K3S_ARGS replacement) and the related fs operations and logging into
that function so the main flow around spawnSync(prepResult) stays concise;
ensure the helper receives rootfsDir (from prepOutput) and returns
success/failure or throws so existing console logs (e.g., "Patched VM init..." /
"Init script not found...") remain consistent and tests can target
patchVmRootfsForMqueue independently.
- Around line 2815-2849: The embedded shell script written to importScript makes
maintenance hard; move the script text into a standalone file (e.g.,
scripts/import-image.sh) and have onboard.ts copy that file into the VM rootfs
instead of constructing the array inline. Update the code around importScript,
fs.writeFileSync, and fs.chmodSync to locate the new script (use
path.join(__dirname, "..", "scripts", "import-image.sh") or equivalent), copy it
into rootfs (fs.copyFileSync or streaming copy), then set executable bits and
keep the existing run(...) and runCapture(...) calls that reference
/opt/nemoclaw/import-image.sh; ensure any template values (if needed) are either
left static or replaced before copying and adjust tests accordingly.
- Around line 2245-2265: There are two different functions named
getInstalledOpenshellVersion: one imported from "./openshell" (signature
(binary: string, opts: CaptureOpenshellOptions)) and a local helper that parses
pre-captured output (signature (versionOutput = null)); rename the local parser
to something unambiguous like parseOpenshellVersionOutput (or
parseCapturedOpenshellVersion), update all internal call sites to that new name,
and leave the imported getInstalledOpenshellVersion unchanged so callers like
the gatewayImage construction continue to use the correct function.

In `@test/e2e/test-vm-backend-e2e.sh`:
- Line 34: Update the shell options line that currently reads "set -uo pipefail"
to include "-e" so the script uses "set -euo pipefail" to enable fail-fast
behavior (or, if omission is intentional, add an explicit comment above the "set
-uo pipefail" line explaining why "-e" is omitted and how failures are tracked);
modify the single-line "set -uo pipefail" occurrence in
test/e2e/test-vm-backend-e2e.sh (the shell options line) accordingly to ensure
intermediate command failures stop the script or are clearly documented.
- Around line 291-295: The current zstd install block (the if ! command -v zstd
conditional) assumes apt-get is available; replace it with a defensive install
routine that first tries apt-get, then falls back to common package managers
(dnf, yum, pacman) and finally prints a clear error/usage message if none are
present; update the block around the "Installing zstd for runtime
decompression..." message to attempt each package manager in order, and on
failure exit with an informative message telling the user to install zstd
manually.
- Around line 488-495: Replace the broad pkill usage with a PID-file based
shutdown: read the PID from ~/.nemoclaw/openshell-vm.pid if it exists, verify
the PID corresponds to a running process whose command line contains
"openshell-vm" (to avoid killing unrelated PIDs), send a targeted kill to that
PID and remove the pidfile; if the pidfile is missing or verification fails,
fall back to the existing pkill -f "openshell-vm" behavior as a last resort and
log appropriate info messages. This change should be applied where pkill -f
"openshell-vm" is used in the test/e2e/test-vm-backend-e2e.sh script.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: accc1d5d-64cf-43eb-8220-b2e5190b92f1

📥 Commits

Reviewing files that changed from the base of the PR and between fcb5fd6 and 5f2db18.

📒 Files selected for processing (18)
  • .agents/skills/nemoclaw-user-reference/references/troubleshooting.md
  • .github/workflows/nightly-e2e.yaml
  • docs/reference/troubleshooting.md
  • k8s/nemoclaw-k8s.yaml
  • nemoclaw-blueprint/blueprint.yaml
  • schemas/blueprint.schema.json
  • scripts/brev-launchable-ci-cpu.sh
  • scripts/install-openshell.sh
  • src/lib/onboard-session.ts
  • src/lib/onboard.ts
  • src/lib/openshell.ts
  • src/lib/platform.ts
  • src/nemoclaw.ts
  • test/e2e/brev-e2e.test.ts
  • test/e2e/test-sandbox-survival.sh
  • test/e2e/test-vm-backend-e2e.sh
  • test/openshell-vm.test.ts
  • test/platform.test.ts

Comment thread src/lib/platform.ts
Comment on lines +123 to +135
function detectGatewayBackend(opts = {}) {
  const env = opts.env ?? process.env;
  const override = env.NEMOCLAW_GATEWAY_BACKEND;
  if (override === "vm" || override === "docker") return override;

  const vmAvailable =
    typeof opts.vmAvailable === "boolean"
      ? opts.vmAvailable
      : false; // caller should pass actual detection result
  const dockerAvailable =
    typeof opts.dockerAvailable === "boolean"
      ? opts.dockerAvailable
      : detectDockerHost(opts) !== null;

⚠️ Potential issue | 🟠 Major

Don't hard-code the default VM availability to false.

When callers omit opts.vmAvailable, this helper still resolves a dual-capable host to "docker", so the new default silently stops preferring VM. Either detect VM availability here or make vmAvailable a required input for this API.

💡 One way to keep the default preference correct
+const { isOpenshellVmAvailable } = require("./openshell");
+
 function detectGatewayBackend(opts = {}) {
   const env = opts.env ?? process.env;
   const override = env.NEMOCLAW_GATEWAY_BACKEND;
   if (override === "vm" || override === "docker") return override;
 
   const vmAvailable =
     typeof opts.vmAvailable === "boolean"
       ? opts.vmAvailable
-      : false; // caller should pass actual detection result
+      : isOpenshellVmAvailable();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:

function detectGatewayBackend(opts = {}) {
  const env = opts.env ?? process.env;
  const override = env.NEMOCLAW_GATEWAY_BACKEND;
  if (override === "vm" || override === "docker") return override;
  const vmAvailable =
    typeof opts.vmAvailable === "boolean"
      ? opts.vmAvailable
      : false; // caller should pass actual detection result
  const dockerAvailable =
    typeof opts.dockerAvailable === "boolean"
      ? opts.dockerAvailable
      : detectDockerHost(opts) !== null;

After:

const { isOpenshellVmAvailable } = require("./openshell");

function detectGatewayBackend(opts = {}) {
  const env = opts.env ?? process.env;
  const override = env.NEMOCLAW_GATEWAY_BACKEND;
  if (override === "vm" || override === "docker") return override;
  const vmAvailable =
    typeof opts.vmAvailable === "boolean"
      ? opts.vmAvailable
      : isOpenshellVmAvailable();
  const dockerAvailable =
    typeof opts.dockerAvailable === "boolean"
      ? opts.dockerAvailable
      : detectDockerHost(opts) !== null;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/platform.ts` around lines 123 - 135, The detectGatewayBackend
function currently defaults opts.vmAvailable to false which causes dual-capable
hosts to prefer "docker" silently; update detectGatewayBackend to either (A)
perform VM detection when opts.vmAvailable is undefined (mirror docker detection
by calling an appropriate VM detection helper, e.g., detectVMHost or similar,
and set vmAvailable based on its result) or (B) make opts.vmAvailable required
and throw a clear error if it's not provided; ensure you update the vmAvailable
initialization (and any callers) instead of hard-coding false and keep the
dockerAvailable logic that calls detectDockerHost unchanged.
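For reference, the detection order documented in the PR description (env override → Docker → VM binary → unknown) reduces to a few lines; this standalone sketch uses an illustrative name to avoid clashing with the real `detectGatewayBackend`, and takes availability flags as explicit inputs:

```typescript
// Sketch of the documented backend-selection order; availability detection
// is injected rather than performed here.
type Backend = "vm" | "docker" | "unknown";

export function detectBackend(opts: {
  env?: Record<string, string | undefined>;
  dockerAvailable: boolean;
  vmAvailable: boolean;
}): Backend {
  const env = opts.env ?? process.env;
  const override = env.NEMOCLAW_GATEWAY_BACKEND;
  if (override === "vm" || override === "docker") return override; // 1. env override wins
  if (opts.dockerAvailable) return "docker";                       // 2. Docker is the default
  if (opts.vmAvailable) return "vm";                               // 3. VM binary in PATH
  return "unknown";                                                // 4. onboard fails with guidance
}
```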

Comment thread src/nemoclaw.ts
Comment on lines +523 to +543
if (session?.gatewayBackend === "vm") {
  // openshell-vm registers as "openshell-vm-nemoclaw", not "nemoclaw"
  const vmGatewayName = `openshell-vm-${NEMOCLAW_GATEWAY_NAME}`;
  runOpenshell(["gateway", "select", vmGatewayName], { ignoreError: true });
  const before = getNamedGatewayLifecycleState();
  if (before.state === "healthy_named") {
    process.env.OPENSHELL_GATEWAY = vmGatewayName;
    return { recovered: true, before, after: before, attempted: false };
  }
  try {
    await startGatewayForRecovery();
  } catch {
    /* fall through */
  }
  runOpenshell(["gateway", "select", vmGatewayName], { ignoreError: true });
  const after = getNamedGatewayLifecycleState();
  if (after.state === "healthy_named") {
    process.env.OPENSHELL_GATEWAY = vmGatewayName;
    return { recovered: true, before, after, attempted: true, via: "start" };
  }
  return { recovered: false, before, after, attempted: true };

⚠️ Potential issue | 🔴 Critical

Parameterize the lifecycle check with the VM gateway name.

This branch selects openshell-vm-nemoclaw, but getNamedGatewayLifecycleState() still queries and compares against "nemoclaw". As written, the VM path can never observe healthy_named, so recovery falls through as failed even when the VM gateway is already up.

🛠️ Suggested shape of the fix
-function getNamedGatewayLifecycleState() {
+function getNamedGatewayLifecycleState(gatewayName = NEMOCLAW_GATEWAY_NAME) {
   const status = captureOpenshell(["status"]);
-  const gatewayInfo = captureOpenshell(["gateway", "info", "-g", "nemoclaw"]);
+  const gatewayInfo = captureOpenshell(["gateway", "info", "-g", gatewayName]);
   const cleanStatus = stripAnsi(status.output);
   const activeGateway = getActiveGatewayName(status.output);
   const connected = /^\s*Status:\s*Connected\b/im.test(cleanStatus);
-  const named = hasNamedGateway(gatewayInfo.output);
+  const named = stripAnsi(gatewayInfo.output).includes(`Gateway: ${gatewayName}`);
   const refusing = /Connection refused|client error \(Connect\)|tcp connect error/i.test(
     cleanStatus,
   );
-  if (connected && activeGateway === "nemoclaw" && named) {
+  if (connected && activeGateway === gatewayName && named) {
     return { state: "healthy_named", status: status.output, gatewayInfo: gatewayInfo.output };
   }
   ...
 }

And in this branch:

-const before = getNamedGatewayLifecycleState();
+const before = getNamedGatewayLifecycleState(vmGatewayName);
...
-const after = getNamedGatewayLifecycleState();
+const after = getNamedGatewayLifecycleState(vmGatewayName);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 523 - 543, The VM branch constructs
vmGatewayName but calls getNamedGatewayLifecycleState() with no name; update
both lifecycle checks to call getNamedGatewayLifecycleState(vmGatewayName) so
the code inspects the VM gateway (before and after), keep
runOpenshell(["gateway","select", vmGatewayName]) and setting
process.env.OPENSHELL_GATEWAY as-is, and leave startGatewayForRecovery()
behavior unchanged unless it also needs a gateway name in future.

Comment thread test/openshell-vm.test.ts
Comment on lines +129 to +164
describe("readVmPid", () => {
  it("returns null when PID file does not exist", () => {
    // readVmPid reads from the real path (~/.nemoclaw/openshell-vm.pid)
    // but we can test the isVmProcessAlive helper with known PIDs
    expect(readVmPid()).toSatisfy((v) => v === null || typeof v === "number");
  });
});

describe("isVmProcessAlive", () => {
  it("returns false for null PID", () => {
    expect(isVmProcessAlive(null)).toBe(false);
  });

  it("returns false for PID 0", () => {
    expect(isVmProcessAlive(0)).toBe(false);
  });

  it("returns false for negative PID", () => {
    expect(isVmProcessAlive(-1)).toBe(false);
  });

  it("returns true for current process PID", () => {
    expect(isVmProcessAlive(process.pid)).toBe(true);
  });

  it("returns false for non-existent PID", () => {
    // Use a very high PID that's unlikely to exist
    expect(isVmProcessAlive(4_000_000)).toBe(false);
  });
});

describe("isVmGatewayHealthy", () => {
  it("returns false when no VM process is running", () => {
    // With no PID file, there's no VM process to check
    expect(isVmGatewayHealthy()).toBe(false);
  });

⚠️ Potential issue | 🟠 Major

Make these VM lifecycle tests hermetic.

readVmPid() and isVmGatewayHealthy() here read the real ~/.nemoclaw/openshell-vm.pid and process table, so the assertions change based on whatever is already running on the host. That makes this unit suite flaky on dev machines and shared CI workers. Inject the pid-file path / liveness probe into the helpers, or mock fs and the process checks for these cases instead. As per coding guidelines: test/**/*.test.{js,ts}: Mock external dependencies in unit tests; do not call real NVIDIA APIs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/openshell-vm.test.ts` around lines 129 - 164, The VM tests are reading
the real PID file and process table causing flakiness; update the helpers or
tests so they don't touch real system state: modify readVmPid to accept an
optional pidFilePath (or add an injectable function) and make isVmGatewayHealthy
accept a liveness-check function (or inject isVmProcessAlive), then in the tests
call these helpers with a temp/test PID file and a mocked liveness probe, or
alternatively mock fs (e.g., fs.readFileSync/existsSync) and process checks used
by isVmProcessAlive/isVmGatewayHealthy; adjust tests to create controlled PID
contents and to stub process liveness so assertions are deterministic.

@wscurran wscurran added enhancement New feature or request Docker Support for Docker containerization labels Apr 13, 2026
jyaunches pushed a commit that referenced this pull request Apr 14, 2026
… version field

Refactor test/e2e/brev-e2e.test.ts for issue #1390:

- Extract named helpers from the ~350-line monolithic beforeAll:
  cleanupLeftoverInstance(), createBrevInstance(), refreshAndWaitForSsh(),
  bootstrapLaunchable(), pollForSandboxReady(), writeManualRegistry().
  The beforeAll now reads as a high-level orchestration (~50 lines).

- Deduplicate brev create + error recovery: the deploy-cli and launchable
  paths shared duplicated brev refresh + waitForSsh patterns, now
  consolidated into refreshAndWaitForSsh().

- Remove phantom version: 1 from the manual registry write. The
  SandboxRegistry interface in src/lib/registry.ts has no version field;
  registerSandbox() doesn't write one either.

- bootstrapLaunchable() returns { remoteDir, needsOnboard } instead of
  mutating module-level state as a hidden side-effect.

- instanceCreated is set at call sites in beforeAll, not hidden inside
  createBrevInstance().

- Remove dead sleep() helper (defined but never called).

Pure refactoring — no behavior changes. // @ts-nocheck pragma preserved.

Note: PR #1791 (openshell-vm microVM) also touches this file — whoever
merges second will need a rebase.
ericksoa added a commit that referenced this pull request Apr 16, 2026
…nstaller detection (#1888)

## Summary

Addresses all three items from #1390, plus a bonus installer fix
discovered while running the pre-push hooks.

### Issue #1390 — Brev E2E cleanup

- **Extract helpers from the monolithic beforeAll**: The ~350-line
`beforeAll` block is now ~50 lines of high-level orchestration calling
named helpers: `cleanupLeftoverInstance()`, `createBrevInstance()`,
`refreshAndWaitForSsh()`, `bootstrapLaunchable()`,
`pollForSandboxReady()`, `writeManualRegistry()`.

- **Deduplicate brev create + error recovery**: The deploy-cli and
launchable paths shared duplicated `brev refresh` + `waitForSsh`
patterns, now consolidated into `refreshAndWaitForSsh()`.

- **Remove phantom `version: 1`** from the manual registry write. The
`SandboxRegistry` interface in `src/lib/registry.ts` has no version
field; `registerSandbox()` doesn't write one either.

Additional cleanup:
- `bootstrapLaunchable()` returns `{ remoteDir, needsOnboard }` instead
of mutating module-level state as a hidden side-effect.
- `instanceCreated` is set at call sites in `beforeAll`, not hidden
inside `createBrevInstance()`.
- Dead `sleep()` helper removed.

Pure refactoring — no behavior changes. `// @ts-nocheck` pragma
preserved.

### Installer worktree fix

`scripts/install.sh` used `-d "${repo_root}/.git"` to detect source
checkouts, but in a git worktree `.git` is a file, not a directory. This
caused `is_source_checkout()` to return false, falling through to the
GitHub clone path. Fixed by using `-e` (exists) instead of `-d` (is
directory). This resolved 12 pre-existing test failures in
`test/install-preflight.test.ts`.

## Related Issue
Closes #1390

## Note
PR #1791 also touches `test/e2e/brev-e2e.test.ts` (appends a new
`vm-backend` test case). Clean merge — whoever merges second rebases
trivially.

## Type of Change
- [x] Code change for a new feature, bug fix, or refactor.

## Testing
- [x] `npx prek run --all-files` passes (all pre-commit and pre-push
hooks green).
- [x] `npx vitest run --project cli` — 1213 passed, 0 failed (was 12
failed before installer fix).


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* Enhanced end-to-end test infrastructure and setup orchestration for
improved test reliability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
Merge main into feat/openshell-vm-backend, keeping both:
- VM backend features (openshell-vm lifecycle, K8s manifest, E2E job)
- Main branch additions (sandbox-operations, inference-routing, snapshot,
  shields-config, rebuild, upgrade-stale-sandbox, hermes rebuild E2E jobs,
  stale gateway container detection, LOCAL_INFERENCE_PROVIDERS, version
  pinning, pruneKnownHostsEntries)
The VM backend K8s manifest used privileged: true and curl|sh, which
fails the security-configuration-hardening test. Restore main's
Docker-in-Docker manifest with proper security context and add a
comment pointing to VM backend docs.
Revert the Docker-based manifest restoration — the point of this PR is
the microVM backend. Restore the VM manifest with pod-level hardening
from main (automountServiceAccountToken, enableServiceLinks, secretKeyRef
for COMPATIBLE_API_KEY, suggested policy mode).

Update the security-configuration-hardening test to branch on backend
type: VM backend validates privileged: true for KVM, Docker backend
validates the full container-level lockdown and secure download pattern.
Swap detectGatewayBackend priority so Docker is preferred when both
runtimes are available. Users can opt into the VM backend by setting
NEMOCLAW_GATEWAY_BACKEND=vm — the env override is already wired
through detectGatewayBackend and the E2E job sets it.

Restore the Docker-based K8s manifest and security test since Docker
is the default path. All VM backend code remains intact and exercised
by the vm-e2e nightly job.
@ericksoa ericksoa changed the title feat(vm): openshell-vm microVM gateway backend (full lifecycle, E2E 30/30) feat(vm): add openshell-vm microVM gateway backend (opt-in via NEMOCLAW_GATEWAY_BACKEND=vm) Apr 20, 2026
ericksoa added 11 commits April 19, 2026 21:09
OpenShell v0.0.28+ includes CONFIG_POSIX_MQUEUE=y in the VM kernel
(commit d8cf7951), so the runc shim that replaced mqueue with tmpfs
is no longer needed. Remove ~80 lines of workaround code.

The glibc supervisor workaround stays — upstream still builds
openshell-sandbox against glibc 2.39 (Ubuntu 24.04) but sandbox
containers use Ubuntu 22.04 (glibc 2.35).

Version pin changes:
- max_openshell_version: 0.0.26 → 0.0.32
- install-openshell.sh MAX_VERSION: 0.0.26 → 0.0.32
- brev-launchable-ci-cpu.sh default: v0.0.26 → v0.0.32
- test-vm-backend-e2e.sh MIN_OPENSHELL: 0.0.26 → 0.0.32
- glibc fallback image tag: 0.0.26 → 0.0.32
The vm-dev runtime artifacts still ship without CONFIG_POSIX_MQUEUE
despite the source fix (d8cf7951). Every container inside the VM
fails with "error mounting mqueue: no such device" without the shim.

Restore the self-disabling workaround. The version bump to 0.0.32
is kept — it brings security improvements and the shim will no-op
once the vm-dev kernel is actually rebuilt.
A cancelled worktree agent extracted provider code into a new
onboard-providers.ts module and rewrote imports in onboard.ts. This
introduced a duplicate `getSandboxInferenceConfig` identifier (imported
from the new module AND defined locally) causing a runtime SyntaxError
that broke `nemoclaw --version` and all E2E jobs.

Fix: restore onboard.ts from the clean merge commit (878ef15) and
re-apply only the glibc fallback version bump (0.0.26 → 0.0.32).
The mqueue shim and all VM lifecycle code are intact.
The inference route can take a few seconds to come back after the VM
gateway restarts. Add a 3-attempt retry with 10s pauses to avoid
flaky failures on the post-resume PONG check.
When the VM gateway is killed and --resume recreates it, the new
gateway has no provider/route state from the old one. But the resume
logic was checking isInferenceRouteReady() which could return stale
metadata, causing the inference step to be skipped. The sandbox then
has no working inference.local route.

Track when the gateway is recreated during resume and force inference
re-registration in that case. Also add diagnostics to the E2E test
(SSH check, raw response, route/provider state on failure).
When the VM gateway dies and --resume recreates it, the old sandbox
pod state persists on the VM's disk. openshell sandbox list reports
it as "ready" (from recovered k8s metadata) but the pod isn't
actually functional — SSH fails with "sandbox is not ready".

Extend the gatewayRecreatedDuringResume guard to also skip the
sandbox reuse check, forcing full sandbox recreation alongside
the inference provider re-registration.
k3s bootstrap inside the microVM can fail due to internal race
conditions (e.g. cloud-controller-manager starting before required
configmaps exist). When this happens the VM process dies and gvproxy
loses its virtio-net socket.

Instead of failing immediately, retry up to 3 times (configurable
via NEMOCLAW_VM_START_ATTEMPTS). Each retry stops the dead VM
process and spawns a fresh one. Log tail is printed between
attempts for diagnostics.
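The retry loop described in this commit reduces to something like the following sketch (the injected start/stop/log functions are assumptions standing in for the real lifecycle code):

```typescript
// Sketch of the VM start retry loop: up to NEMOCLAW_VM_START_ATTEMPTS attempts,
// stopping the dead VM and printing a log tail between failures.
const DEFAULT_ATTEMPTS = 3;

export async function startVmWithRetry(
  startVm: () => Promise<boolean>, // spawns openshell-vm; resolves true when healthy
  stopDeadVm: () => void,          // kills the dead VM process, if any
  printLogTail: () => void,        // diagnostics between attempts
  env: NodeJS.ProcessEnv = process.env,
): Promise<void> {
  const attempts =
    Number.parseInt(env.NEMOCLAW_VM_START_ATTEMPTS ?? "", 10) || DEFAULT_ATTEMPTS;
  for (let i = 1; i <= attempts; i++) {
    if (await startVm()) return;
    printLogTail();
    if (i < attempts) stopDeadVm(); // clean up before spawning a fresh VM
  }
  throw new Error(`openshell-vm failed to start after ${attempts} attempts`);
}
```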
setupInference hardcodes `gateway select nemoclaw` which resets the
active gateway away from `openshell-vm-nemoclaw` when the VM backend
is in use. This causes provider creation to fail with "No gateway
metadata found for 'nemoclaw'" and leaves the sandbox with no
inference route, no providers, and broken SSH.

Add getEffectiveGatewayName() that returns the correct gateway name
based on the active backend. Use it in setupInference instead of the
hardcoded GATEWAY_NAME constant.
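Given the naming convention noted elsewhere in this PR (openshell-vm registers as "openshell-vm-nemoclaw"), the helper is essentially a one-liner; this sketch inlines the constant for self-containment:

```typescript
// Sketch of getEffectiveGatewayName(): resolves the gateway name for the
// active backend instead of the hardcoded GATEWAY_NAME constant.
const GATEWAY_NAME = "nemoclaw";

export function getEffectiveGatewayName(backend: "vm" | "docker"): string {
  // openshell-vm registers the gateway as "openshell-vm-<name>", not "<name>".
  return backend === "vm" ? `openshell-vm-${GATEWAY_NAME}` : GATEWAY_NAME;
}
```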
The VM resume had a fast-path that short-circuited the entire resume
flow: restart VM, check if sandbox metadata says "ready", wait for
inference route, exit 0. This assumed k3s state survives an unclean
VM shutdown — but it doesn't reliably. Providers, routes, and sandbox
pod readiness are all stale after VM kill.

Replace with: mark all session steps as pending and fall through to
the normal resume path, which re-runs gateway startup, inference
provider registration, and sandbox creation from scratch. This is
the same path used for Docker gateway recovery and is known to work.
