UTF Backlog

This is the single, consolidated backlog for the Unified Temporal Fabric. It replaces the per-subsystem backlogs that previously lived at drawfs/drawfs-BACKLOG.md, semadraw/semadraw-BACKLOG.md, semaaud/semaaud-BACKLOG.md, semainput/semainput-BACKLOG.md, chronofs/chronofs-BACKLOG.md, and shared/shared-BACKLOG.md.

Those files remain as short pointers to this one, so existing links and references continue to resolve, but they are no longer the source of truth for tasks. This file is.

Split structure (2026-05-27 evening). Closed and superseded entries have been moved to BACKLOG-history.md to keep this file focused on outstanding work. The split is mechanical: every entry that was in BACKLOG.md before the split appears in exactly one of the two files now. Cross-references to closed entries are by name (e.g. "AD-39") and resolve to the history file. Section headers (##) are preserved in both files where they have at least one entry of the relevant class.


How to read this

Work is grouped by substrate, with each item numbered in its historical ID (e.g. DF-1, C-3, A-2) so external references don't break. Status is tracked per item:

  • [x] Done: implemented, landed on master, acceptance criteria met.
  • [~] Fix applied, awaiting verification: code change is in place but confirmation on the target host is pending. Flips to [x] once the relevant test run or smoke check comes back clean.
  • [ ] Open: not yet started.
  • [ ] Deferred: consciously postponed, with a note explaining why.

Priorities are P0 (project-level invariant or blocker), P1 (near-term, directly unblocks downstream work), P2 (valuable but not on the critical path), or unset for items that don't need ranking.

All seven implementation waves (the original chronofs-anchored dependency chain) are complete. The current theme is to make DRM strictly optional: preserve the DRM-less default path as UTF's unbreakable invariant while allowing opt-in DRM for users who want it.


Current theme: make DRM strictly optional

The goal is that the DRM-less swap path remains the unbreakable default and that DRM/KMS support is a strictly optional add-on. A user running sh configure.sh and accepting defaults must produce a drawfs.ko with no DRM references, no drm-kmod build dependency, and no drm-kmod load dependency.

Non-goals

  • Making DRM the default. It will never be the default.
  • Detecting drm-kmod automatically. Autodetection leaks the opinion that "DRM is better" into the build; it is not.
  • Removing drawfs_drm.c or the kernel-side #ifdef DRAWFS_DRM_ENABLED gates. They are correct already.
  • Surfacing DRM backend selection through semadrawd CLI. semadrawd -b drawfs is agnostic to the kernel backend.

Project-level invariants

These hold across all changes. Any future work that would break one needs its own backlog item first, documenting why the invariant is changing.

  1. sh configure.sh with all defaults → swap-only drawfs.ko.
  2. drm-kmod is never a build-time or load-time hard dependency.
  3. hw.drawfs.backend defaults to "swap" at module load.
  4. DRM init failure at module load falls back to swap, never panics, never prevents load.
  5. Renaming DRAWFS_DRM_ENABLED requires coordinating with every #ifdef in drawfs.c, drawfs_drm.c, and both Makefiles.
  6. UTF_OS detection is informational only. Any future use that branches build behavior on it must be justified by a concrete, observable divergence on the FreeBSD target, not by speculation.
  7. UTF depends only on code written with UTF's guarantees in mind. External dependencies are either replaced by UTF-owned code or explicitly accepted as named platform-transport dependencies. See docs/UTF_ARCHITECTURAL_DISCIPLINE.md for the accepted list and the three postures (Replace / Accept / Remove).
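
Invariants 3 and 4 can be modelled in a few lines of userspace Python. This is a sketch, not the kernel code: `select_backend`, `drm_compiled_in`, and `drm_init` are hypothetical names standing in for the hw.drawfs.backend tunable, the DRAWFS_DRM_ENABLED build gate, and the DRM init path respectively.

```python
from typing import Callable

DEFAULT_BACKEND = "swap"  # invariant 3: the default at module load


def select_backend(requested: str = DEFAULT_BACKEND,
                   drm_compiled_in: bool = False,
                   drm_init: Callable[[], bool] = lambda: False) -> str:
    """Return the backend that ends up active. Never raises (invariant 4)."""
    if requested != "drm":
        return "swap"
    if not drm_compiled_in:
        # Swap-only build (invariant 1): a DRM request cannot be honoured.
        return "swap"
    try:
        ok = drm_init()
    except Exception:
        ok = False  # init failure must not prevent load
    return "drm" if ok else "swap"


def failing_init() -> bool:
    """Hypothetical init that fails the way a missing KMS device might."""
    raise RuntimeError("no KMS device")
```

Every path that is not a successful, explicitly requested DRM init ends at "swap", which is the property the invariants are protecting.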

drawfs: kernel spatial substrate

/dev/draw character device, surface lifecycle, mmap-backed pixel buffers, framed binary protocol, input event injection.

[ ] DF-6: DRM backend runtime wiring (Open, Medium; depends: DF-3; filed 2026-05-21; not scheduled)

Tracks: drawfs/sys/dev/drawfs/drawfs.c (drawfs_reply_surface_present), drawfs/sys/dev/drawfs/drawfs_drm.c (drawfs_drm_surface_present), docs/DF4_VERIFICATION.md (AD-18.7 fix design).

DF-3 closed as "Done, skeleton": drawfs_drm.c contains a full DRM/KMS implementation (connector enumeration, mode set, dumb buffer allocation, page flip), but the SURFACE_PRESENT handler in drawfs.c does not dispatch into it. grep finds no callers of drawfs_drm_surface_present. DF-6 connects the two so that, when hw.drawfs.backend=drm is selected on hardware with a matching KMS driver, a client's SURFACE_PRESENT actually drives a page flip.

Scope:

  • Add a backend-dispatch step in drawfs_reply_surface_present (drawfs.c line ~459, after the surface lookup succeeds) that, when the active display is DRM-backed and the surface is in a state suitable for present, calls drawfs_drm_surface_present(dd, surf, damage, damage_count).
  • Resolve the locking-order question for the call: the handler currently holds s->lock over the lookup but not over the reply send; the DRM backend takes dd->drm_mtx. The two locks are unrelated by ordering convention but the call must not nest s->lock under dd->drm_mtx or vice versa in a way that creates a new WITNESS warning.
  • Apply the AD-18.7 fix in drawfs_drm_surface_present: refactor the function to capture flip parameters under dd->drm_mtx, release the lock, call drm_ioctl_kern(DRM_IOCTL_MODE_PAGE_FLIP, ...) unlocked, re-acquire the lock to install flip_pending = 1 and perform the front/back swap. Same shape as AD-18.2 (vm_pager_allocate refactor) and AD-18.3/4 (M_WAITOK malloc refactor): capture, release, slow call, re-acquire, install or yield.
  • Page-flip completion: today dd->flip_pending is set but nothing clears it. A kthread reading from the DRM event queue and toggling the flag on DRM_EVENT_FLIP_COMPLETE is the production answer. Initial DF-6 may set up the kthread skeleton without requiring complete event-queue handling.
  • Bench verification under PGSD-DEBUG: WITNESS clean on sustained present workload, no leak in vmobj_allocs == vmobj_deallocs, flip_pending clears between presents (when the kthread lands), front/back pointers stay consistent across many cycles.
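
The "capture, release, slow call, re-acquire, install or yield" shape named in the AD-18.7 bullet can be sketched in userspace Python. This is an illustrative model only: `DisplayModel`, `slow_flip`, and `flip_complete` are hypothetical stand-ins, a `threading.Lock` plays the role of dd->drm_mtx, and the real function is kernel C.

```python
import threading


class DisplayModel:
    """Userspace stand-in for the fields the DF-6 text names."""

    def __init__(self, slow_flip):
        self.drm_mtx = threading.Lock()  # models dd->drm_mtx
        self.front, self.back = 0, 1     # buffer indices
        self.flip_pending = False
        self._slow_flip = slow_flip      # models the page-flip ioctl

    def surface_present(self) -> bool:
        # 1. Capture flip parameters under the lock.
        with self.drm_mtx:
            if self.flip_pending:
                return False             # yield: a flip is already in flight
            target = self.back
        # 2. Lock released: make the slow call unlocked.
        ok = self._slow_flip(target)
        # 3. Re-acquire, re-check state, then install the result.
        with self.drm_mtx:
            if not ok or self.flip_pending or self.back != target:
                return False
            self.flip_pending = True
            self.front, self.back = self.back, self.front
            return True

    def flip_complete(self) -> None:
        """What a completion kthread would do on a flip-complete event."""
        with self.drm_mtx:
            self.flip_pending = False
```

The re-check in step 3 matters because another thread may have changed the display state while the lock was dropped; that is the "install or yield" half of the pattern.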

Hardware dependency: DF-6 cannot be runtime-verified without DRM-capable hardware running a FreeBSD KMS driver (amdgpu, i915kms, nvidiakms, radeonkms). The PGSD test machines today use efifb only. DF-6 is therefore filed but not scheduled until matching hardware is available or a virtualised KMS environment is set up (VirtualBox VGA does not present a KMS DRM device; qemu's virtio-gpu does).

Discharges: AD-18.7. The fix design for that locking bug is captured in the AD-18 entry and applies cleanly in this work.

Filing note: DF-6 was filed 2026-05-21 during a status audit that identified an inconsistency in BACKLOG: AD-18.7 was tagged "deferred to DF-3" but DF-3 had closed as "skeleton" without the wiring AD-18.7 was waiting for. DF-6 is the missing piece that makes the dependency chain honest. AD-18.7's deferral target was re-tagged from DF-3 to DF-6 in the same commit.


Deferred

[~] B3.3: Damage / partial-update swap-path implementation (Pass 1 in tree; Pass 2 and Pass 3 not landed. Status corrected 2026-05-27 evening; previously claimed "Done" in error.)

Three-pass implementation of DRAWFS_REQ_SURFACE_PRESENT_REGION in the swap-backed kernel path. Status accurate as of the 2026-05-27 correction; ground truth is the source tree, not the previous claim.

  1. Pass 1 (validator): landed. Pure function drawfs_req_surface_present_region_validate in drawfs_frame.c enforcing the full error table from the design doc. 15 userspace unit tests pass; kernel compile clean on the FreeBSD target. The validator's own comment at drawfs_frame.c:77 is explicit that the work it does not do ("does not consult session state, does not look up the surface, does not clamp rects to surface bounds, and does not allocate") is deferred to Pass 2.
  2. Pass 2 (dispatch + coalescing + sysctl): not landed. The intended handler drawfs_reply_surface_present_region in drawfs.c does not exist; the dispatch switch at drawfs.c:1400-1444 has no case for DRAWFS_REQ_SURFACE_PRESENT_REGION (only the original DRAWFS_REQ_SURFACE_PRESENT). No hw.drawfs.region_coalesce_threshold sysctl is registered (verified by grep over drawfs/sys/dev/drawfs/*.c; the only sysctls in the module are the dev_uid/gid/mode, mmap, evq, surface-budget, and vmobj-debug ones, plus hw.drawfs.coalesce_events from earlier work). A non-PGSD client issuing DRAWFS_REQ_SURFACE_PRESENT_REGION today would hit the dispatch default arm and receive ERR_UNSUPPORTED_CAP, which is consistent with the design doc's backward-compatibility section but is NOT what "Done" would mean.
  3. Pass 3 (integration tests): partially landed (test file only; no implementation under test). drawfs/tests/test_surface_present_region.py exists with the 18 test cases described, but with Pass 2's handler absent the tests cannot exercise dispatch, clamping, coalescing, or the N=1-full-surface equivalence invariant against a real server. The test file is checked in but is not bench-verified against landed code.
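
The Pass 2 gap can be pictured as a dispatch table with a missing entry. The sketch below is a userspace model, not the switch in drawfs.c; the opcode and error values are placeholders chosen for illustration.

```python
ERR_OK = 0
ERR_UNSUPPORTED_CAP = 1                # placeholder numeric value

REQ_SURFACE_PRESENT = 0x0030           # hypothetical opcode values
REQ_SURFACE_PRESENT_REGION = 0x0031


def reply_surface_present(payload: bytes) -> int:
    return ERR_OK


def reply_surface_present_region(payload: bytes) -> int:
    # Stand-in for the handler Pass 2 would add.
    return ERR_OK


# Today's state: only the original present request is wired up.
HANDLERS = {REQ_SURFACE_PRESENT: reply_surface_present}


def dispatch(opcode: int, payload: bytes = b"") -> int:
    handler = HANDLERS.get(opcode)
    if handler is None:
        return ERR_UNSUPPORTED_CAP     # the default arm
    return handler(payload)
```

In this model, landing Pass 2 amounts to registering the region handler, after which the same request stops falling through to the default arm.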

Corroborating evidence for the corrected status: BACKLOG.md lines 282-283 (in a separate entry, written later) already note that "the validator added in pass 1 still has no callers". That statement contradicts the "Done" claim on this entry and was correct. This entry was the stale one.

How the misclaim happened (best reconstruction without git history for these commits): the entry appears to have been written speculatively, describing what B3.3 would be when all three passes landed, then marked [x] Done before Pass 2 and Pass 3 actually shipped. The validator's defer-to-Pass-2 comment was preserved through whatever sync brought the working tree to its current state, but the dispatch+handler+sysctl were not.

Implementation impact going forward: AD-43.3b's plan (paused as of 2026-05-27 evening, awaiting the Scenario 1 vs Scenario 3 decision per the AD-43 evening update) depends on Pass 2 and Pass 3 actually landing. The work estimated by the original B3.3 entry ("18 userspace unit tests on clamp and threshold arithmetic pass; kernel compile clean, sysctl exposed on target") is still ahead, not behind. Effort estimate: small-to-medium kernel-side work plus a re-bench of the existing Python test fixture.

Design choices documented in drawfs/docs/DESIGN-surface-present-region.md: sum-of-areas coalescing (not true union), single event type (EVT_SURFACE_PRESENTED_REGION) regardless of collapse, no cross-request region-event coalescing.
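
A minimal sketch of sum-of-areas coalescing, assuming rects are (x, y, width, height) tuples and that the threshold is a percentage of the surface area. Both assumptions are illustrative; the authoritative semantics are in the design doc, and `clamp`, `coalesce`, and `threshold_pct` are hypothetical names.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def clamp(rect: Rect, surf_w: int, surf_h: int) -> Rect:
    """Clip a rect to the surface bounds; width/height may become 0."""
    x, y, w, h = rect
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, surf_w), min(y + h, surf_h)
    return (x0, y0, max(x1 - x0, 0), max(y1 - y0, 0))


def coalesce(rects: List[Rect], surf_w: int, surf_h: int,
             threshold_pct: int = 50) -> List[Rect]:
    """Collapse to one full-surface rect when summed area crosses the threshold."""
    clamped = [clamp(r, surf_w, surf_h) for r in rects]
    clamped = [r for r in clamped if r[2] > 0 and r[3] > 0]
    # Sum of areas, not area of the union: overlap is counted twice.
    total = sum(w * h for _, _, w, h in clamped)
    if total * 100 >= threshold_pct * surf_w * surf_h:
        return [(0, 0, surf_w, surf_h)]
    return clamped
```

Two identical rects each covering 30% of the surface collapse under a 50% threshold in this model even though their union covers only 30%. That is the cost of avoiding a true union computation, in exchange for integer-only arithmetic that is linear in the rect count.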

[ ] B3.4–B3.5: Damage / partial-update: DRM path and semadraw emitter (Deferred, P2; depends: B3.3)

Once the swap path is complete (B3.3, whose Pass 2 and Pass 3 are still outstanding per the corrected status above), the remaining implementation is:

  1. B3.4: DRM path. drmModeDirtyFB when the kernel DRM driver supports it, full-present fallback otherwise. Only meaningful with DRAWFS_DRM_ENABLED. Requires access to a drm-kmod-enabled FreeBSD 15 host to exercise end-to-end.
  2. B3.5: semadraw emitter. Extend semadraw/src/backend/drawfs.zig to emit region presents when the compositor's damage tracker produces a bounded rect set. Requires B3.4 to be landed first for end-to-end testing.

Non-goals and acceptance criteria are documented in full at drawfs/docs/DESIGN-surface-present-region.md.


Long-term: Quartz Equivalent on UTF

These items represent the path toward a native GNUstep/AppKit display stack on UTF, a Quartz equivalent that requires no X11. They are long-term architectural goals, not near-term sprint items.

Background. UTF already provides the lower half of this stack: drawfs owns the framebuffer (/dev/draw), semadrawd is the compositor, SDCS is the drawing command stream, and the EFI framebuffer backend means the stack runs on any UEFI machine without a GPU driver. What is missing is the retained-mode layer model above SDCS that Quartz Compositor provides, and a GNUstep display backend that targets semadraw rather than X11.

[ ] LT-1: Layer Tree Protocol on top of SDCS (Open, Large)

Depends on: SDCS stable, semadrawd compositor operational

Surfaces become layers with transform, opacity, clip, and z-order properties. Clients describe a retained scene graph rather than pushing raw pixel commands each frame. semadrawd composites the layer tree rather than blitting each surface independently.

Key design points:

  • Extend the semadraw IPC protocol with SET_LAYER_TRANSFORM, SET_LAYER_OPACITY, SET_LAYER_CLIP messages
  • semadrawd maintains a retained layer tree per client session
  • Only damaged layers are re-rendered each frame
  • Layer properties are animatable (see LT-2)
  • Implementation lives in semadraw/src/daemon/layer_tree.zig
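
As a rough illustration of the retained model, the Python sketch below keeps a per-session layer set and re-renders only layers whose properties changed. All names here (`Layer`, `LayerTree`, `render_frame`) are hypothetical; the eventual implementation would be Zig in semadraw/src/daemon/layer_tree.zig, and a real compositor would also have to recomposite layers that overlap a damaged one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Affine = Tuple[float, float, float, float, float, float]  # a, b, c, d, tx, ty
ClipRect = Tuple[int, int, int, int]                      # x, y, width, height
IDENTITY: Affine = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


@dataclass
class Layer:
    layer_id: int
    z_order: int = 0
    opacity: float = 1.0
    transform: Affine = IDENTITY
    clip: Optional[ClipRect] = None
    damaged: bool = True  # a new layer needs its first render


class LayerTree:
    """Retained layer set for one client session."""

    def __init__(self) -> None:
        self.layers: Dict[int, Layer] = {}

    def add(self, layer_id: int, z_order: int = 0) -> Layer:
        layer = Layer(layer_id, z_order)
        self.layers[layer_id] = layer
        return layer

    def set_opacity(self, layer_id: int, opacity: float) -> None:
        layer = self.layers[layer_id]
        layer.opacity = min(max(opacity, 0.0), 1.0)
        layer.damaged = True

    def set_transform(self, layer_id: int, transform: Affine) -> None:
        layer = self.layers[layer_id]
        layer.transform = transform
        layer.damaged = True

    def set_clip(self, layer_id: int, clip: Optional[ClipRect]) -> None:
        layer = self.layers[layer_id]
        layer.clip = clip
        layer.damaged = True

    def render_frame(self) -> List[int]:
        """Return ids of damaged layers, back to front, and clear damage."""
        dirty = sorted((l for l in self.layers.values() if l.damaged),
                       key=lambda l: l.z_order)
        for layer in dirty:
            layer.damaged = False
        return [layer.layer_id for layer in dirty]
```

The three setters correspond to the SET_LAYER_TRANSFORM, SET_LAYER_OPACITY, and SET_LAYER_CLIP messages; a frame with no property changes returns an empty list and does no rendering work.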

[ ] LT-2: Animation Engine driven by the chronofs Clock (Open, Large)

Depends on: LT-1, chronofs ChronofsClockSource wired into semadrawd frame scheduler

An animation engine that interpolates layer properties between frames, driven by the chronofs audio-hardware clock. This is the UTF equivalent of Core Animation's display link and implicit transaction model.

Key design points:

  • Animations are submitted as (property, from, to, duration, curve) tuples via the semadraw IPC protocol
  • The frame scheduler calls nextFrameTarget() from chronofs to determine the next sample-aligned frame boundary
  • Property values are interpolated at each frame boundary and applied to the layer tree before compositing
  • Animations are drift-free by construction, clocked against audio hardware rather than wall time, eliminating audio/visual skew
  • Easing curves: linear, ease-in, ease-out, ease-in-out, spring
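
A sketch of the interpolation step, with time measured in audio samples rather than wall-clock seconds. The quadratic curve shapes are an assumption (the backlog names the curves but not their formulas), the spring curve is omitted, and `next_frame_target` is a hypothetical Python model of the idea behind chronofs's nextFrameTarget(), not its actual signature.

```python
def ease(curve: str, t: float) -> float:
    """Map linear progress t in [0, 1] to eased progress."""
    t = min(max(t, 0.0), 1.0)
    if curve == "linear":
        return t
    if curve == "ease-in":
        return t * t
    if curve == "ease-out":
        return 1.0 - (1.0 - t) ** 2
    if curve == "ease-in-out":
        return 2.0 * t * t if t < 0.5 else 1.0 - 2.0 * (1.0 - t) ** 2
    raise ValueError(f"unknown curve: {curve}")


def next_frame_target(now_samples: int, samples_per_frame: int) -> int:
    """Smallest frame boundary strictly after now_samples."""
    return (now_samples // samples_per_frame + 1) * samples_per_frame


def animate(start: float, end: float, start_samples: int,
            duration_samples: int, curve: str, at_samples: int) -> float:
    """Value of an animated property at a given sample position."""
    t = (at_samples - start_samples) / duration_samples
    return start + (end - start) * ease(curve, t)
```

At a 48 kHz sample rate and 60 frames per second, each frame spans 800 samples, so `next_frame_target(799, 800)` is 800 and `next_frame_target(800, 800)` is 1600. Because progress is computed from the sample position alone, the value at a given sample is the same no matter how wall time has drifted.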

[ ] LT-3: GNUstep Backend targeting semadraw instead of X11 (Open, Large)

Depends on: LT-1, LT-2; libs-opal and libs-quartzcore in GNUstep upstream

A GNUstep display backend (back-semadraw) that implements GSDisplayServer against semadraw rather than X11. This allows the full GNUstep/AppKit application stack to run natively on UTF without X11 as an intermediary, on any UEFI machine including older hardware with no GPU driver.

Key design points:

  • back-semadraw implements GSDisplayServer using the semadraw client library (libsemadraw)
  • Opal (2D drawing, PDF model) maps its drawing operations to SDCS commands
  • QuartzCore (layer compositing, Core Animation) maps to the LT-1 layer tree protocol and LT-2 animation engine
  • Applications run unmodified on bare metal FreeBSD via any UTF backend: EFI framebuffer (any UEFI machine, no GPU driver required), Vulkan (GPU-accelerated), or X11 (compatibility mode)
  • Makes UTF the FreeBSD analog of Quartz Compositor on macOS, with GNUstep as the application framework above it

NDE: Native Desktop Environment

NDE is the policy and user experience layer above semadraw and drawfs. It lives at https://github.com/pgsdf/NDE and defines versioned contracts for windowing policy, input, settings, session management, and compatibility. NDE does not redefine kernel graphics transport or semantic rendering; those remain the responsibility of drawfs and semadraw respectively.

NDE Milestone 0 (vocabulary freeze, charter, design specification, repository skeleton) is complete. The items below correspond to NDE Milestone 1 (substrate validation) and beyond.

Relationship to LT-1 through LT-3. NDE is usable today without the long-term Quartz equivalent items; it can manage semadraw-term sessions and basic SDCS applications using the current immediate-mode rendering model. LT-1 (layer tree) would make NDE's own UI smoother and enable proper animated transitions. LT-3 (GNUstep backend) would make GNUstep applications first-class NDE citizens without X11.

[ ] NDE-1: Surface Manager (Open, Medium)

Depends on: semadrawd compositor operational (done)

Tracks: NDE Milestone 1, substrate validation

Implement the NDE windowing policy contract (DESIGN.md §3.2): toplevel surfaces, popups, stacking rules, focus transitions, server-side decorations. NDE acts as a privileged semadraw client that manages surface z-order and focus on behalf of all other clients.

Key design points:

  • NDE registers with semadrawd as the window manager client
  • Surface stacking is controlled via SET_Z_ORDER messages
  • Focus ownership follows DESIGN.md §3.2 semantics
  • Server-side decorations rendered as NDE-owned surfaces overlaid on application surfaces

[ ] NDE-2: System Bar (Open, Small–Medium)

Depends on: NDE-1

Tracks: NDE Milestone 2, daily driver core

A persistent surface at a fixed screen edge showing: active application name, workspace indicator, clock, and system status. Rendered entirely in SDCS via libsemadraw.

[ ] NDE-3: Launcher (Open, Medium)

Depends on: NDE-1

Tracks: NDE Milestone 2, daily driver core

Application discovery and launch. Reads a manifest of installed NDE applications, presents a keyboard-navigable launcher surface, and spawns selected applications as managed semadraw clients.

[ ] NDE-5: X11 Compatibility Bridge (Open, Large)

Depends on: NDE-1, SM-1

Tracks: NDE Milestone 3, compatibility

Rootless X11 server integration: map X windows to semadraw surfaces, translate input and clipboard, integrate drag and drop. IME integration path required for international use.

Classification note: the NDE DESIGN.md originally described the X11 bridge as "mandatory for usability." This has been revised to required for compatibility. UTF now has a native terminal (semadraw-term) and the long-term path (LT-3) provides native GNUstep application support without X11. The X11 bridge remains important for running existing legacy X11 applications but is no longer a prerequisite for the environment to be usable.

SM: Session Management

SM is the PGSD distribution layer's session-management track, covering authentication, login, session lifecycle, and related concerns. It is desktop-agnostic by design: SM components do not depend on NDE, do not encode NDE-specific behaviour, and could in principle be reused by a different distribution built on UTF.

The track was opened on 2026-05-10 to replace the original NDE-4 "Session Manager" entry, which conflated session management with desktop work. See docs/sessions/2026-05-10.md for the architectural reasoning and pgsd-sessiond/docs/adr/0001-design.md for the SM-1 design.

The pgsd- prefix marks distribution-layer components, distinct from UTF userland's sema- prefix. UTF has stable substrate contracts; PGSD is one set of choices on top of those contracts.

[ ] SM-2: screen lock daemon (Open, Small-Medium)

Depends on: SM-1.

A screen-lock daemon that reuses SM-1's PAM stack to require re-authentication after idle timeout or explicit user request, without tearing down the user's session. Out of v1 SM-1 scope; opens as a separate item when SM-1 is far enough along to expose the right hooks.

Design pending. Likely a small daemon launched per-session by the user's session leader, sitting alongside whatever desktop environment is running. Unlocks the screen via the same PAM conversation SM-1 uses for login, but does not exec a new session leader; it just dismisses the lock surface.

[ ] SM-3: idle and power management integration (Open, Medium)

Depends on: SM-2 (for the idle-detection plumbing).

Idle-state tracking and power-management integration: blanking the display after configurable idle, suspending the system on deeper idle, handling the resume-from-suspend case (which may or may not require re-authentication depending on the operator's policy via SM-2). Out of scope for v1 SM-1.

This is the natural home for "agent-of-record for XDG_RUNTIME_DIR-equivalent" if PGSD chooses to ship one. The ADR for SM-3 will decide whether to adopt the FreeDesktop convention, define a PGSD-native equivalent, or punt entirely.

Architectural Discipline

The project's discipline (UTF depends only on code written with UTF's guarantees in mind) is stated in full at docs/UTF_ARCHITECTURAL_DISCIPLINE.md. This section tracks the work streams that apply the discipline to subsystems where external dependencies currently sit inside UTF's guarantee path. Items here represent multi-stage replacements, not individual features; each item typically has its own design document or proposal that details the stages.

[~] AD-1: inputfs: native input substrate (In progress, Large)

Tracks: inputfs/docs/inputfs-proposal.md and inputfs/docs/foundations.md.

Replace the evdev / bsdinput / libinput dependency chain with inputfs, a UTF-owned kernel input substrate. Publishes input state and events via shared memory, timestamps with the UTF dual-clock (monotonic + audio-sync), routes events via compositor-driven focus. Closes the coordinate-space bug (previously tracked as D-6 and superseded by this item), eliminates device-accumulated coordinates, and removes userspace semainputd as a component (see AD-2).

Status: Stages A, B, C, and D complete (all eight Stage D sub-stages landed: D.0a, D.0b, D.1, D.2, D.3, D.4, D.5, D.6).

The chronofs ts_sync integration deferred from Stage C landed 2026-05-05 and is partially verified on bare metal: every event now stamps ts_sync from a kthread-refreshed cache of /var/run/sema/clock at emit time. The clock-absent / clock_valid = 0 failure path is verified end-to-end (clock file shows byte 5 = 0, events show ts_sync = 0, no log spam, fall-through is correct). The non-zero stamping path requires an active semaaud client driving a real audio stream (no such client exists in the tree today; OSS shim writes to /dev/dsp bypass semaaud); final verification of non-zero ts_sync is therefore deferred until a semaaud client is available, likely as a side effect of AD-3 audio-output work or earlier audio-app integration. The success path is structurally trivial given the failure path works (a memcpy of 8 bytes plus an atomic load, no intermediate logic), but trivial isn't verified, so the partial-verification status is accurate.

The HUP_DIGITIZERS parser sub-item, formerly listed here as deferred, landed and was verified on bare metal 2026-05-06 (commits 9bb35ff steps 2+3, fb018da steps 4+5, closure 183410a; classifier role, field locator, report-byte parser, and Touchpad Mode feature-report send all implemented, wired in at inputfs.c, and verified end-to-end against the HAILUCK 0x258a:0x000c trackpad). See the "Sub-item: HUP_DIGITIZERS parser for Win8+ multi-touch" section below, which carries the full Done status and verification evidence; this summary previously contradicted it and is corrected here.

Two AD-1 sub-items now remain post-Stage-D and keep this entry at [~] rather than [x]:

  • Pollable-fd / kqfilter for the events ring (small-medium; semadrawd's existing poll-and-drain loop absorbs events fine in practice; the pollable fd is a latency improvement, not a correctness gate; verified absent from inputfs.c as of this audit).
  • A separate parallel sub-item for Apple multi-touch trackpad support (medium-large; vendor-specific protocol distinct from Microsoft Win8+ HUP_DIGITIZERS, requires reverse-engineered prior-art reference; verified absent from the inputfs sources as of this audit; not gating Phase 2.5 multi-touch verification, which targeted the HAILUCK HUP_DIGITIZERS path and is now closed).

Both remaining items appear in their longer form near the end of this entry. Stage E (semainputd retirement, AD-2) was completed: AD-2 closed 2026-05-17 (see AD-2 entry in BACKLOG-history.md); AD-9 hardening was completed before that cutover.

Stage A delivered the proposal, foundations, UTF_ARCHITECTURAL_DISCIPLINE.md, ADRs 0001 through 0011, and four byte-level companion specs (shared/INPUT_STATE.md, shared/INPUT_EVENTS.md, shared/INPUT_FOCUS.md, and shared/INPUT_IOCTL.md). Stage B delivered HID attachment via hidbus, descriptor parsing, interrupt handler registration, raw report hex logging, and per-device role classification. Stage C delivered userspace publication of the state region and event ring. Stage B and Stage C sub-stage detail follows.

Stage B sub-stages:

  • B.1 module skeleton loads and unloads cleanly: landed, verified.
  • B.2 device attachment on hidbus with HID TLC matching per ADR 0007: landed, verified on Razer Viper (live system) and VirtualBox USB Tablet (VM).
  • B.3 HID report descriptor fetch and walk per ADR 0008: landed, verified on VirtualBox USB Tablet (85-byte descriptor, 11 input items, depth 2).
  • B.4 interrupt handler registration via hidbus_set_intr and raw report hex logging per ADR 0009: landed, verified on a physical USB mouse passed through to a FreeBSD VirtualBox VM. Live reports flow with non-zero motion deltas during use; inputfs0: detached on unplug; clean kldunload with no dmesg warnings.
  • B.5 per-device role classification into softc bitmask per ADR 0004 and ADR 0010: landed, verified on the PGSD kernel on bare metal. Six USB HID devices across three TLC classes attached and classified correctly: ELECOM BlueLED Mouse (vendor=0x056e, product=0x00e3, roles=pointer); HAILUCK touchpad keyboard TLC (vendor=0x258a, product=0x000c, roles=keyboard); HAILUCK touchpad mouse TLC (same vendor:product, roles=pointer); Broadcom Bluetooth keyboard TLC (vendor=0x05ac, product=0x8294, roles=keyboard); Broadcom Bluetooth mouse TLC (same vendor:product, roles=pointer); Apple Keyboard (vendor=0x05ac, product=0x021d, roles=keyboard). Report flow verified at 640 lines for sustained mouse input. Clean kldunload produced six detached lines and no dmesg warnings.
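
B.5's classification step can be pictured as a lookup from a device's top-level collection (TLC) usage to a role bitmask. The usage page and usage IDs below are from the USB HID Usage Tables; the role bit values and the exact mapping are assumptions for illustration, since ADR 0004 and ADR 0010 are not reproduced in this backlog.

```python
# Hypothetical role bits; the real softc bitmask layout is defined by the ADRs.
ROLE_POINTER, ROLE_KEYBOARD, ROLE_TOUCH, ROLE_PEN = 1, 2, 4, 8

# (usage_page, usage) of the top-level collection -> role bit.
TLC_ROLES = {
    (0x01, 0x01): ROLE_POINTER,   # Generic Desktop / Pointer
    (0x01, 0x02): ROLE_POINTER,   # Generic Desktop / Mouse
    (0x01, 0x06): ROLE_KEYBOARD,  # Generic Desktop / Keyboard
    (0x01, 0x07): ROLE_KEYBOARD,  # Generic Desktop / Keypad
    (0x0D, 0x02): ROLE_PEN,       # Digitizers / Pen
    (0x0D, 0x04): ROLE_TOUCH,     # Digitizers / Touch Screen
    (0x0D, 0x05): ROLE_TOUCH,     # Digitizers / Touch Pad
}

ROLE_NAMES = [(ROLE_POINTER, "pointer"), (ROLE_KEYBOARD, "keyboard"),
              (ROLE_TOUCH, "touch"), (ROLE_PEN, "pen")]


def classify(usage_page: int, usage: int) -> int:
    """Return the role bitmask for one TLC; 0 means no recognised role."""
    return TLC_ROLES.get((usage_page, usage), 0)


def role_names(mask: int) -> list:
    """Render a bitmask the way the attach log lines do (roles=...)."""
    return [name for bit, name in ROLE_NAMES if mask & bit]
```

A composite device such as the HAILUCK touchpad exposes one TLC per function, which is why it shows up above as two attachments with different roles under the same vendor:product pair.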

ADR 0006 was drafted against legacy ukbd/ums reference drivers that are not loaded on modern FreeBSD 15; it is superseded by ADR 0007 (hidbus attachment). The shipped code attaches at hidbus and works against the modern HID stack. ADR 0008 carries an errata section recording a hid_start_parse kindset correction made during B.3 verification.

Verification environment note (B.5). Bare-metal verification on stock FreeBSD is structurally blocked: stock FreeBSD compiles hkbd statically into the GENERIC kernel and ships hms, hkbd, hcons, hsctrl, and other competing HID drivers as auto-loadable modules with linker.hints registrations. The ADR 0009 workflow of unloading competing drivers cannot succeed against statically compiled code, and even when modules are unloaded at runtime the kernel auto-load machinery reloads them on the next USB event. The PGSD kernel resolves this: nodevice lines remove the competing drivers from the static kernel image (see pgsd-kernel/PGSD), and the build-produced .ko files in /boot/kernel/ are moved aside before verification so linker.hints cannot find them to autoload (a stopgap; the durable answer is WITHOUT_MODULES in /etc/src.conf, tracked under AD-8). With both kernel image and module files clean of competitors, inputfs binds at hidbus without contention and all four B.5 signals pass. Earlier VirtualBox-based verification in this project's history exercised the mouse path on a Razer Viper but is no longer the reference: PGSD targets bare-metal FreeBSD, and B.5's verifying evidence is the bare-metal PGSD-kernel run captured in b5-pass2-baremetal.log. The verification protocol in inputfs/docs/B5_VERIFICATION.md documents the workflow.

Stage C: state publication. Per the inputfs proposal, Stage C made inputfs's internal state visible to userspace through three shared-memory regions under /var/run/sema/input/. The regions are specified in inputfs/docs/adr/0002-shared-memory-regions.md with byte-level layouts in shared/INPUT_STATE.md, shared/INPUT_EVENTS.md, and shared/INPUT_FOCUS.md (all landed as Stage A artifacts). Stage C implemented against those specs. semainputd remained unchanged; evdev still drove production; inputfs gained a user-visible output but no consumers yet.

Stage C broke into five sub-stages, mirroring Stage B's rhythm: each sub-stage landed and was verified independently before the next started. Sub-stage detail follows.

  • C.1 shared/src/input.zig library: StateWriter/StateReader, EventRingWriter/EventRingReader, FocusWriter/FocusReader. Mirrors the clock.zig pattern. Pure Zig, userspace-testable with unit tests. No kernel work, no hardware dependency. Lands the API surface that the kernel writer (C.2, C.3) and the CLI reader (C.4) both build against. Landed 2026-04-27 with 15 passing unit tests covering size constants, parent dir creation, magic rejection, pointer and device round-trips, ring drain ordering, ring overrun, and focus pointer resolution.
  • C.2 kernel state-region writer in inputfs.c: creates /var/run/sema/input/state on module load per the byte layout in shared/INPUT_STATE.md and the regions decision in inputfs/docs/adr/0002-shared-memory-regions.md. Publishes device inventory from B.5's softc role bitmask, updates the seqlock-protected fields on every event admission. Pointer position is published in raw device space; coordinate transform to compositor space is Stage D work (per ADR 0002 §Decision item 5, the transform mechanism is deferred). Landed 2026-04-27 with end-to-end verification on PGSD-bare-metal: six HID devices (ELECOM mouse, HAILUCK touchpad keyboard and pointer, Broadcom Bluetooth keyboard and pointer, Apple Keyboard) reporting correct vendor, product, roles, and names. Architecture: 11,328-byte module-global live buffer, MTX_SPIN serialization, kthread worker syncing via vn_rdwr.
  • C.3 kernel event-ring writer in inputfs.c: creates /var/run/sema/input/events, appends events to the ring on every interrupt callback (the path that currently logs hex to dmesg in B.4). Sequence numbers are strictly monotonic. ts_ordering comes from the kernel monotonic clock. The plan allowed ts_sync to be either wired to chronofs (preferred, gives ADR 0011 measurement substrate) or left zero (the spec allows it), and called for a pollable fd via kqueue. Landed 2026-04-27 with verification on PGSD-bare-metal: 224 pointer.motion events plus left and right button cycles, all with strictly monotonic seqs and timestamps. Per-event publication uses partial vn_rdwr writes (slot plus header, ~128 bytes per typical sync). As landed in C.3: the pollable fd is deferred to a follow-on sub-stage; ts_sync is left zero, with chronofs integration also deferred; keyboard, touch, and pen events are deferred (they need descriptor-driven parsing).
  • C.4 inputdump CLI tool in Zig under inputfs/tools/, parallel to chronofs/tools/chrono_dump.zig. Reads the state region and event ring, presents them. Useful for verification end-to-end and for ad-hoc debugging. Landed 2026-04-27 with four subcommands (state, events, watch, devices), human-readable and --json output, and event filtering by role, device slot, and event type. The C.2/C.3 throwaway inputstate-check.zig was deleted in the same commit.
  • C.5 verification protocol (inputfs/docs/C_VERIFICATION.md) plus scripts under inputfs/test/c/: signals for region creation, header validity, device inventory publication, event ring monotonicity, pollable-fd wakeups, clean unload. Pattern follows B.5's verification protocol. Landed 2026-04-27 with c-verify.sh (top-level orchestrator running seven phases end-to-end) and c-fixtures.sh (sourced helper library). Pollable-fd verification deferred along with the pollable fd itself; the protocol document notes the placeholder.
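
The ring behaviour that C.1's unit tests and C.5's monotonicity signal exercise (in-order drain, overrun detection) can be modelled as below. This is a userspace sketch under assumed semantics: `EventRing` and `RingReader` are hypothetical names, not the shared/src/input.zig API, and the real ring is a shared-memory region with the byte layout in shared/INPUT_EVENTS.md.

```python
class EventRing:
    """Single-writer ring; every event gets a strictly increasing sequence."""

    def __init__(self, slots: int) -> None:
        self.slots = [None] * slots
        self.next_seq = 1  # sequence number the next event will receive

    def append(self, payload) -> int:
        seq = self.next_seq
        self.slots[seq % len(self.slots)] = (seq, payload)
        self.next_seq = seq + 1
        return seq


class RingReader:
    """Reader cursor that drains in order and counts overrun losses."""

    def __init__(self, ring: EventRing) -> None:
        self.ring = ring
        self.cursor = ring.next_seq  # start at the live edge
        self.lost = 0

    def drain(self) -> list:
        ring = self.ring
        oldest = max(ring.next_seq - len(ring.slots), 1)
        if self.cursor < oldest:
            # The writer lapped us: skip to the oldest surviving event.
            self.lost += oldest - self.cursor
            self.cursor = oldest
        out = []
        while self.cursor < ring.next_seq:
            out.append(ring.slots[self.cursor % len(ring.slots)])
            self.cursor += 1
        return out
```

A slow reader never blocks the writer in this model; it finds out after the fact how many events it missed, from the gap between its cursor and the oldest surviving sequence number.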

The Stage A focus region (shared/INPUT_FOCUS.md) is part of C.1's library deliverable: FocusWriter/FocusReader belong in shared/src/input.zig because the API surface is shared. The kernel-side use of FocusReader (consuming compositor focus to route events) is Stage D work, not Stage C.

The state region's spec describes pointer_x/pointer_y as compositor-space. Stage C publishes them in raw device space because inputfs has no transform machinery yet; that machinery arrives in Stage D. The state region remains structurally correct across the transition; only the semantics of what's in those two fields changes.
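
For readers unfamiliar with the seqlock that protects fields such as pointer_x/pointer_y, the protocol can be sketched as follows. This is a single-process model only: `SeqlockState` is a hypothetical name, and a real implementation over shared memory also needs memory barriers, which plain Python attributes do not express.

```python
class SeqlockState:
    """Sequence counter is odd while a write is in progress, even when stable."""

    def __init__(self) -> None:
        self.seq = 0
        self.pointer_x = 0
        self.pointer_y = 0

    def write(self, x: int, y: int) -> None:
        self.seq += 1          # odd: update in progress
        self.pointer_x = x
        self.pointer_y = y
        self.seq += 1          # even: update complete

    def read(self, max_retries: int = 1000):
        for _ in range(max_retries):
            before = self.seq
            if before & 1:
                continue       # writer mid-update, retry
            x, y = self.pointer_x, self.pointer_y
            if self.seq == before:
                return x, y    # no write overlapped the read
        raise RuntimeError("seqlock read did not stabilise")
```

The reader never takes a lock; it either gets a consistent pair or retries. This suits a single kernel writer with many userspace readers.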

C.2 kernel-side considerations (historical, pre-implementation design notes; the choices below were made and the implementation landed accordingly). The state region is 11,328 bytes on disk, single-writer (the kernel), multiple-reader (userspace). Userspace consumers mmap the file shared and read via StateReader from shared/src/input.zig; the kernel cannot link userspace Zig and instead writes the same byte layout from kernel context. Several FreeBSD-specific decisions shape the implementation:

  • File creation and write path. The kernel cannot mmap a userland filesystem path the way userspace does. The two viable patterns are (a) vn_open plus vn_rdwr from a kthread context, opening /var/run/sema/input/state as a regular file and overwriting it byte-for-byte on every state update, or (b) maintaining the canonical state in a kernel-resident buffer and bouncing updates to userland via a helper. Neither pattern has precedent in the UTF codebase: existing userland files under /var/run/sema/ (the audio clock, the session token) are written by userspace daemons. inputfs C.2 is the first kernel-context writer of a /var/run/sema/ file. Pattern (a) is the simpler path. C.2 will start with (a) and measure; pattern (b) becomes a tractable optimisation if (a)'s overhead is intolerable.
  • Mutex strategy. B.5's sc_mtx per softc protects per-device state during attach, classification, and the interrupt path. The state region adds a global resource: the seqlock counter, the device inventory array, and the per-event last_sequence value all need atomic-multi-field-update semantics. A new module-global mutex (provisionally inputfs_state_mtx) will bracket seqlock increments and field writes; the per-softc sc_mtx remains for per-device state. Order is sc_mtx then inputfs_state_mtx to avoid deadlock on attach.
  • Writer context. State updates land from interrupt callback context (B.4's inputfs_intr path). Vnode I/O from interrupt context is forbidden in FreeBSD; that means the writer cannot call vn_rdwr directly from inputfs_intr. The interrupt handler must enqueue the state update onto a kthread-backed worker that performs the vnode write outside interrupt context. This is a non-trivial dispatch boundary and is the chief reason C.2 is sized larger than C.1.
  • Unload semantics. On kldunload, the state region file is left in place (per the spec's "file persists; next load resets it" lifecycle note). The kthread worker must drain pending writes before the module unloads to avoid use-after-free on the softc state.
  • Module-load message. inputfs_modevent's current MOD_LOAD printf advertises Stage B.5. C.2's commit updates that string to reflect state-region publication and drops the "no userspace event delivery" qualifier (which becomes false at C.3, not C.2; C.2 publishes state but not yet the event ring).
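The writer-context constraint above (the interrupt path may not do vnode I/O, so it hands updates to a worker) can be pictured with a small bounded queue. This is a userspace sketch only, not the inputfs implementation: `struct state_update`, `queue_push`, `queue_drain`, and the recording sink are hypothetical names, the queue depth is arbitrary, and the locking a kernel version would need between the interrupt path and the kthread is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical update record; the real state-update payload is not shown here. */
struct state_update {
    uint32_t device_slot;
    int32_t  pointer_x;
    int32_t  pointer_y;
};

#define QUEUE_DEPTH 8u /* power of two so that masking wraps the index */

struct update_queue {
    struct state_update slots[QUEUE_DEPTH];
    uint32_t head;    /* next slot the worker consumes */
    uint32_t tail;    /* next slot the interrupt path fills */
    uint32_t dropped; /* updates rejected because the queue was full */
};

typedef void (*update_sink_fn)(const struct state_update *u, void *ctx);

/* Interrupt side: never blocks and never touches the file. */
static bool queue_push(struct update_queue *q, const struct state_update *u)
{
    if (q->tail - q->head == QUEUE_DEPTH) {
        q->dropped++;
        return false;
    }
    q->slots[q->tail & (QUEUE_DEPTH - 1u)] = *u;
    q->tail++;
    return true;
}

/* Worker side: hands every pending update to the sink (the stand-in for the
 * vnode write) and returns how many were written. Calling this once more
 * before unload is the "drain pending writes" step. */
static size_t queue_drain(struct update_queue *q, update_sink_fn sink, void *ctx)
{
    size_t written = 0;

    assert(sink != NULL);
    while (q->head != q->tail) {
        sink(&q->slots[q->head & (QUEUE_DEPTH - 1u)], ctx);
        q->head++;
        written++;
    }
    return written;
}

/* Example sink standing in for the vnode write; it only records what it saw. */
struct recording_sink {
    size_t  writes;
    int32_t last_x;
};

static void recording_sink_write(const struct state_update *u, void *ctx)
{
    struct recording_sink *s = ctx;

    s->writes++;
    s->last_x = u->pointer_x;
}
```

The push side either stores the update or counts a drop, so the interrupt path has a fixed, small cost regardless of how slow the file write is.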

C.5 verification signals (preview) (historical, pre-implementation design notes; the verification protocol that landed in C.5 covers all of these signals plus several more). When C.2 lands the verification protocol in inputfs/docs/C_VERIFICATION.md should exercise, in the pattern established by b5-verify-reports.sh:

  • State file presence and permissions: /var/run/sema/input/state exists after kldload inputfs, is STATE_SIZE bytes (11,328), is readable by the user account that runs userspace tools.
  • Header validity: magic decodes to INST (0x494E5354), version is 1, state_valid transitions 0 to 1 once the first device attaches.
  • Device inventory: the populated slots in the device array match the attached devices observed in dmesg after B.5's roles= lines, with roles bitmasks consistent with B.5's classification.
  • Seqlock toggling: under sustained input, seqlock advances by two per update (the writer increments once before and once after each update, so a value read outside an update is always even); a userspace inputdump (C.4) capturing N snapshots over a recorded interval observes monotonic advance.
  • Clean unload: kldunload inputfs completes without panics, the kthread worker drains, the state file persists with state_valid = 1 until the next load truncates it.

These signals are concrete enough to write the verification script against once C.2 and C.4 are both landed.
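As an illustration of the header-validity and seqlock signals, here is a minimal userspace sketch of a writer that bumps the counter twice per update and a reader that retries on an odd or changed counter. The magic and version values come from the list above; the struct layout, field order, and function names are assumptions made for this example and do not reflect the real 11,328-byte region.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STATE_MAGIC   0x494E5354u /* "INST" */
#define STATE_VERSION 1u

/* Hypothetical header subset; the real field order and widths differ. */
struct state_hdr {
    uint32_t         magic;
    uint8_t          version;
    uint8_t          state_valid;
    _Atomic uint32_t seqlock;
    int32_t          pointer_x;
    int32_t          pointer_y;
};

/* Writer: odd while the fields are in flux, even again once they are stable. */
static void state_write(struct state_hdr *s, int32_t x, int32_t y)
{
    uint32_t seq = atomic_load_explicit(&s->seqlock, memory_order_relaxed);

    atomic_store_explicit(&s->seqlock, seq + 1u, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    s->pointer_x = x;
    s->pointer_y = y;
    atomic_store_explicit(&s->seqlock, seq + 2u, memory_order_release);
}

/* Reader: validate the header, then retry until one stable snapshot is seen
 * or the retry budget runs out. */
static bool state_read(struct state_hdr *s, int32_t *x, int32_t *y,
                       unsigned max_retries)
{
    assert(x != NULL && y != NULL);
    if (s->magic != STATE_MAGIC || s->version != STATE_VERSION ||
        s->state_valid == 0)
        return false;

    for (unsigned i = 0; i <= max_retries; i++) {
        uint32_t before = atomic_load_explicit(&s->seqlock, memory_order_acquire);
        if (before & 1u)
            continue; /* writer is mid-update */
        int32_t sx = s->pointer_x;
        int32_t sy = s->pointer_y;
        atomic_thread_fence(memory_order_acquire);
        uint32_t after = atomic_load_explicit(&s->seqlock, memory_order_relaxed);
        if (before == after) {
            *x = sx;
            *y = sy;
            return true;
        }
    }
    return false;
}
```

A verification script built on this idea would check that successive counter readings are even and non-decreasing, which is the "monotonic advance" signal named above.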

Stage C closeout (2026-04-27). All five sub-stages landed and were verified end-to-end on PGSD-bare-metal with six HID devices: the ELECOM BlueLED Mouse, HAILUCK touchpad keyboard and pointer TLCs, Broadcom Bluetooth keyboard and pointer TLCs, and Apple Keyboard. State region and event ring publish correctly, magic and version match the spec, device inventory matches dmesg, lifecycle events fire one per attaching device with monotonic seqs, pointer.motion events stream from the ELECOM mouse, button transitions emit pointer.button_down and pointer.button_up correctly. Module load, unload, and reload cycles are clean with no M_INPUTFS leaks. The verification protocol at inputfs/docs/C_VERIFICATION.md captures the full test recipe; inputfs/test/c/c-verify.sh reports 26 of 26 automated checks passing.

Stage C deferred items. Three items were scoped out of Stage C; their disposition is now:

  • Pollable fd. Deferred at Stage C closeout and since superseded by AD-41.3 / ADR 0021 (see the pollable-fd sub-item below). The item was a /dev/inputfs cdev with kqfilter and EVFILT_READ support, so userspace consumers can block on events instead of polling the ring. Stage C's userspace consumers poll at an interval (the inputdump default is 100 ms); this is fine for a diagnostic tool, and semadrawd's main poll loop absorbs events without measurable drop in practice. The pollable fd is a latency improvement rather than a correctness gate.
  • chronofs ts_sync integration. Landed 2026-05-05 as part of AD-1's tail; partially verified on bare metal. Every event now stamps ts_sync from a kthread-refreshed cache of /var/run/sema/clock at emit time. Cache refresh follows the D.1 focus-reader pattern (vn_rdwr from kthread, spin-locked snapshot for interrupt consumers); the kernel does not mmap the clock file. When the clock file is absent (semaaud not running), magic or version mismatch, or clock_valid = 0, ts_sync falls through to the documented 0 sentinel — no regression for consumers that already handle the unavailable case. The failure path is verified end-to-end: clock byte 5 = 0 → events emit with ts_sync = 0 → no log spam → fall-through is correct. The non-zero stamping path (clock_valid = 1 and samples_written advancing) is not verified because no semaaud client exists in the tree today; OSS shim writes to /dev/dsp bypass semaaud's userland accept loop. Final verification of non-zero ts_sync is deferred until a semaaud client exists, likely as a side effect of AD-3 audio-output work or earlier audio-app integration. The success path is structurally trivial given the failure path works (a memcpy of 8 bytes plus an atomic load, no intermediate logic), but the verification is honest about what was tested and what wasn't.
  • Descriptor-driven event generation for keyboard, touch, pen, and scroll. Keyboard, scroll, and basic descriptor-driven pointer parsing landed under Stage D (D.0a + D.0b). Touch and pen remain deferred per ADR 0012; they need digitizer hardware to verify and are tracked in this entry's tail alongside the pollable-fd item.
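The ts_sync fall-through described in the chronofs item can be sketched as a pure function over the cached clock bytes. Only the "clock_valid lives at byte 5" detail is taken from the text above; the magic value, the version value and its offset, the offset of the 8-byte field, and the function name are placeholders, and how the real code derives ts_sync from the clock contents is not specified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CLOCK_MAGIC_HYPOTHETICAL        0x434C4B31u /* placeholder value */
#define CLOCK_VERSION_HYPOTHETICAL      1u          /* placeholder value */
#define CLOCK_VERSION_OFFSET_HYPOTHETICAL 4u        /* placeholder offset */
#define CLOCK_VALID_OFFSET              5u          /* "clock byte 5" */
#define CLOCK_VALUE_OFFSET_HYPOTHETICAL 8u          /* placeholder offset */
#define CLOCK_MIN_SIZE                  16u

/* Returns the 8-byte clock value, or the 0 sentinel whenever the cache is
 * absent, too short, has the wrong magic or version, or is marked invalid. */
static uint64_t ts_sync_from_cache(const uint8_t *cache, size_t len)
{
    uint32_t magic;
    uint64_t value;

    assert(CLOCK_VALUE_OFFSET_HYPOTHETICAL + sizeof value <= CLOCK_MIN_SIZE);
    if (cache == NULL || len < CLOCK_MIN_SIZE)
        return 0; /* clock file absent */
    memcpy(&magic, cache, sizeof magic);
    if (magic != CLOCK_MAGIC_HYPOTHETICAL)
        return 0; /* magic mismatch */
    if (cache[CLOCK_VERSION_OFFSET_HYPOTHETICAL] != CLOCK_VERSION_HYPOTHETICAL)
        return 0; /* version mismatch */
    if (cache[CLOCK_VALID_OFFSET] == 0)
        return 0; /* clock_valid = 0 */
    memcpy(&value, cache + CLOCK_VALUE_OFFSET_HYPOTHETICAL, sizeof value);
    return value;
}
```

Every failure branch returns the same sentinel, which is why consumers that already handle ts_sync = 0 see no behavioural change when semaaud is not running.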

Stage D: focus routing and coordinate transform. Stage C publishes input data in raw device space; Stage D adds the transform machinery that maps device coordinates to compositor space, and consumes the focus region to route events to the correct session. Stage D is scoped in inputfs/docs/adr/0012-stage-d-scope.md, which records the design decisions made during Stage D scoping (sysctl-based geometry exposure from drawfs, kernel-side focus routing in inputfs, stamp-and-filter session_id placement, transform_active byte for coordinate semantics signaling, hw.inputfs.enable tunable semantics, and descriptor-driven event scope).

Stage D breaks into eight sub-stages, each landed and verified independently before the next starts. The dependency order is approximately D.0a or D.0b first (independent of each other), then D.1 and D.2 (independent of each other), then D.3 and D.4 (D.3 depends on D.2 and D.0a; D.4 depends on D.1), then D.5, then D.6.

  • D.0a descriptor-driven pointer events: replace boot-protocol parsing with hid_locate-based extraction at attach + hid_get_data calls at interrupt time. Adds report-ID dispatch for devices with multiple top-level collections. Adds scroll-wheel event type if HUG_WHEEL is present. Landed (commits 123a2b4 and 309329d).
  • D.0b descriptor-driven keyboard events: emit keyboard.key_down / keyboard.key_up from descriptor-driven parsing of the modifier byte and the keys-held array under HUP_KEYBOARD. Tracks held keys in the softc to compute transitions. Modifiers carried in each event's payload field (per existing shared/INPUT_EVENTS.md spec); no separate modifier-transition events. Landed (commit 42dfd57).
  • D.1 kernel-side FocusReader equivalent in C: mmap the focus file at module load (or first use), retry until focus_valid = 1, snapshot under the seqlock retry protocol, surface keyboard_focus, pointer_grab, and surface_map for routing. Landed (commits 35ab475 and 948d346). Implementation uses vn_rdwr against a cached buffer rather than mmap; the kthread refreshes via bounded msleep_spin every ~100 ms, and inputfs_focus_snapshot is safe to call from interrupt context under spin lock. Seqlock retry is folded into the refresh-then-validate cycle.
  • D.2 drawfs geometry sysctl: drawfs publishes display geometry under hw.drawfs.efifb.*; inputfs reads at module load via kernel_sysctlbyname, falls back to a conservative default if the sysctls are absent. Landed (commits f7cb38f, 8804e60, and 732f737).
  • D.3 coordinate transform: clamp pointer position to display bounds learned from D.2, publish in compositor pixel space, set transform_active = 1 in the state region header. Seed pointer to display centre on first activation. Landed (commit e644594).
  • D.4 routing application: stamp events with session_id from the focus snapshot, synthesise pointer.enter and pointer.leave events when surface-under-cursor changes between successive pointer events. Apply keyboard-focus routing (events delivered to keyboard_focus if non-zero). Landed (commit 0c610fd).
  • D.5 hw.inputfs.enable tunable: gate publication. When 0, inputfs is fully inert (no state updates, no ring updates, state_valid = 0, events_valid = 0). When 1, full publication. Clean valid-byte transitions on flip. Landed (commit d0dd1fc).
  • D.6 Stage D verification protocol: extend c-verify.sh (or write a new d-verify.sh) and a D_VERIFICATION.md document. Mirrors C.5's automated phases plus a manual checklist for keyboard events (D.0b), transform behaviour (D.3), routing (D.4), and the tunable's transitions (D.5). Landed (commit f5e2ada); chose new d-verify.sh rather than extending c-verify.sh.
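To illustrate the held-key tracking that D.0b describes, the sketch below diffs the previous keys-held array against the current one and emits one transition per changed key, carrying the modifier byte in every event. It is a simplified userspace model under assumptions: a six-slot keys-held array, usage 0 meaning "empty slot", and hypothetical type and function names; it is not the inputfs parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_KEYS 6u /* assumption: six-slot keys-held array */

enum key_event_type { KEY_DOWN, KEY_UP };

struct key_event {
    enum key_event_type type;
    uint8_t usage;     /* HID keyboard usage code */
    uint8_t modifiers; /* modifier byte carried in every event */
};

struct kbd_state {
    uint8_t held[MAX_KEYS]; /* keys reported as held in the previous report */
};

static bool keys_contain(const uint8_t *keys, uint8_t usage)
{
    for (size_t i = 0; i < MAX_KEYS; i++)
        if (keys[i] == usage)
            return true;
    return false;
}

/* Compares the new report against the tracked state, writes the transitions
 * to out, updates the state, and returns the number of events written. */
static size_t kbd_diff(struct kbd_state *st, const uint8_t *report_keys,
                       uint8_t modifiers, struct key_event *out, size_t cap)
{
    size_t n = 0;

    assert(cap >= 2u * MAX_KEYS); /* worst case: six releases plus six presses */

    /* Keys held before but missing now were released. */
    for (size_t i = 0; i < MAX_KEYS; i++) {
        uint8_t u = st->held[i];
        if (u != 0 && !keys_contain(report_keys, u))
            out[n++] = (struct key_event){ KEY_UP, u, modifiers };
    }
    /* Keys present now but not held before were pressed. */
    for (size_t i = 0; i < MAX_KEYS; i++) {
        uint8_t u = report_keys[i];
        if (u != 0 && !keys_contain(st->held, u))
            out[n++] = (struct key_event){ KEY_DOWN, u, modifiers };
    }
    memcpy(st->held, report_keys, MAX_KEYS);
    return n;
}
```

A change in the modifier byte alone produces no event in this model, matching the "no separate modifier-transition events" choice recorded for D.0b.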

Touch and pen events are explicitly out of scope for Stage D (per ADR 0012); they are tracked as a separate AD-1 sub-item post Stage D. The chronofs ts_sync integration (Stage C deferred item) also stays separate from Stage D unless D.6 verification surfaces a need for it.

Closed finding (2026-05-06): D.3 emits motion events with non-zero dy while y stays at 0 (was Small). Originally surfaced during AD-2a Phase 1 verification via inputdump events. When the pointer was held at a screen edge, motion events reported y = 0 and dy = -N for many frames in a row even though the position didn't change, producing phantom drift in consumers that integrate (dx, dy) to maintain their own pointer state.

The fix landed in inputfs_state_update_pointer: the function now returns the post-clamp delta (the actual change in position after edge clamping) via two out-parameters, and the motion event emitter writes those into the payload's dx/dy fields instead of the raw HID deltas. When clamping is inactive (geometry unknown), the post-clamp deltas equal the raw deltas, so behaviour without drawfs is unchanged. When the cursor is held against an edge, payload dx/dy now report 0 in that direction instead of the unrealised raw delta.
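The post-clamp delta behaviour can be shown with a small standalone model. This is a sketch under assumptions rather than the body of inputfs_state_update_pointer: the struct, the function name `pointer_update`, and the inclusive [0, size-1] clamp range are illustrative choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct pointer_state {
    int32_t x, y;
    int32_t width, height; /* display bounds; meaningful only if known */
    bool    geometry_known;
};

static int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Applies a raw HID delta, clamps to the display when geometry is known, and
 * reports the delta that was actually realised through the out-parameters. */
static void pointer_update(struct pointer_state *p, int32_t raw_dx,
                           int32_t raw_dy, int32_t *out_dx, int32_t *out_dy)
{
    int32_t nx = p->x + raw_dx;
    int32_t ny = p->y + raw_dy;

    assert(out_dx != NULL && out_dy != NULL);
    if (p->geometry_known) {
        nx = clamp_i32(nx, 0, p->width - 1);
        ny = clamp_i32(ny, 0, p->height - 1);
    }
    *out_dx = nx - p->x; /* 0 when pinned against a vertical edge */
    *out_dy = ny - p->y; /* 0 when pinned against a horizontal edge */
    p->x = nx;
    p->y = ny;
}
```

With the pointer at y = 0 and a raw dy of -7, the reported dy is 0, so a consumer integrating (dx, dy) stays in step with the absolute position. With geometry unknown, the reported deltas equal the raw ones.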

ADR 0012 D.3 description amended to reflect the new behaviour. D_VERIFICATION.md D.3 manual checklist updated with a post-clamp-delta verification step. Compositor consumers that read absolute (x, y) — including the current production semadrawd — are unaffected; the bug only manifested for consumers that integrated (dx, dy). Effect on AD-2a Phase 1 verification capture: the original dy = -N while y = 0 sequence at the top wall now reports dy = 0.

Sub-item: HUP_DIGITIZERS parser for Win8+ multi-touch (Medium-Large)

Status: Done 2026-05-06. Scoped 2026-05-05; hardware identified, characterized, descriptor decoded, ADR written (ADR 0018 plus 2026-05-05 amendment correcting Touchpad Mode mechanism); classifier extension, locator, parser, and Touchpad Mode feature-report send all implemented and verified end-to-end on bare metal against the HAILUCK 0x258a:0x000c trackpad.

Verification evidence: session 2026-05-06 on pgsd-bare-metal-test-machine. dmesg attach line for the HAILUCK trackpad (inputfs2) shows the full chain: pointer locations cached, digitizer locations cached (report_id=7, all eight fields present, x_range=[0..1535] y_range=[0..1023]), roles=pointer,touch, and "Device Mode set to MT Touchpad (report_id=11 rlen=2)". inputdump events captured the touch lifecycle for several gestures including single-finger drag (one type1, ~110 type2s, one type3), two-finger drag (interleaved per-contact-id type1/type3 events), and brief taps. Per-contact tracking works, per-Q1 per-report emission works, per-Q2 confidence-low handling didn't trip on any input. The Phase 2.5 verification status doc section 7 deferral entry closes on the strength of this evidence.

Why this exists: scenarios 7-9 of semadraw/docs/PHASE_2_5_VERIFICATION.md (pinch, two-finger scroll, three-finger swipe) require SOURCE_TOUCH events with contact tracking. inputfs's existing pointer-locate path (Stage D.0a) walks past HUP_DIGITIZERS collections because no parser for that usage page exists yet. The deferred multi-touch verification BACKLOG entry under AD-2a points here as its blocker.

Hardware target: HAILUCK USB touchpad (vendor=0x258a, product=0x000c) on pgsd-bare-metal-test-machine. The device's 505-byte HID report descriptor was captured 2026-05-05 via the hw.inputfs.debug_descriptor sysctl introduced in commit 41e8f74. Decoded byte-by-byte, the descriptor contains:

  • Report ID 1: legacy 5-button mouse fallback (HUP_GENERIC_DESKTOP, what inputfs currently parses).
  • Report ID 7: full multi-touch digitizer (HUP_DIGITIZERS, usage 0x05 Touch Pad), with a Finger collection (usage 0x22) containing Tip Switch (0x42), Confidence (0x47), Contact Identifier (0x51, 3-bit, supports up to 8 contacts), X (15-bit, physical max 800 in cm units), Y (15-bit, physical max 600), Scan Time (0x56), Contact Count (0x54), and Button 1 from HUP_BUTTON for the clickpad button.
  • Report IDs 2, 3, 9: system / consumer / wireless control buttons (top-row keys, brightness, volume, WLAN radio).
  • Report IDs 5, 6, 10: vendor-defined feature reports (firmware configuration, opaque to the parser).
  • Report IDs 8, 11, 12, 13: Win8+ touchpad capability and configuration feature reports (Contact Count Maximum 0x55, Pad Type 0x59, Device Configuration, Surface Switch + Button Switch, Latency Mode 0x60).

This is the same descriptor shape Linux's hid-multitouch and FreeBSD's wmt(4) driver target.

Critical subtlety: device starts in Mouse Mode. Win8+ touchpads emit either Report ID 1 (mouse mode) or Report ID 7 (touchpad mode) depending on the Device Mode feature field, which on the HAILUCK lives in Report ID 11. (The original scoping text attributed the switch to the Surface Switch + Button Switch bits in Report ID 12; the 2026-05-05 ADR 0018 amendment corrected this.) By default the device is in Mouse Mode and emits only Report ID 1. inputfs must send a feature report at attach time setting Device Mode = 0x03 (Multi-touch Touchpad) to switch the device into Touchpad Mode. Without this, even with the parser implemented, the device will continue to emit only mouse-class reports.

Implementation plan (roughly one week of kernel work plus testing iterations):

  1. Walk the rest of the descriptor. Done 2026-05-05. Full descriptor walk captured in inputfs/docs/adr/0018-hup-digitizers-parser.md section 1 (verified descriptor structure for HAILUCK 0x258a:0x000c). Two findings shape subsequent steps: (a) the descriptor declares one Finger collection per Report ID 7, so the parser emits one event per arrival ("hybrid mode" pattern from the Microsoft Precision Touchpad spec); (b) the device starts in Mouse Mode by default and emits Report ID 7 only after the host writes Device Mode = 0x03 (Multi-touch Touchpad) to the Device Mode feature field, which on the HAILUCK lives in Report ID 11. Both findings are documented in ADR 0018 sections 2 and 3, with the ADR amendment of 2026-05-05 correcting an earlier misidentification of the mode-switch mechanism. Report ID 7 layout (bit offsets and field sizes) is in section 1; Report ID 11 layout and Device Mode value table is in section 3.

  2. Add HUP_DIGITIZERS classifier role. Done 2026-05-06 (commit 9bb35ff). Extended the Stage B.5 role bitmask with a second pass over the descriptor's Application Collections looking for HUP_DIGITIZERS Touch Pad / Touch Screen / Pen usages. Devices presenting a Touch Pad collection gain INPUTFS_ROLE_TOUCH; devices presenting a Pen collection gain INPUTFS_ROLE_PEN. The matched-TLC first pass remains unchanged so devices presenting only one Application Collection classify identically to before. The HAILUCK trackpad now classifies as roles=pointer,touch instead of just roles=pointer.

  3. Add inputfs_digitizer_locate analog of inputfs_pointer_locate. Done 2026-05-06 (commit 9bb35ff). The locator pins down the digitizer's report ID by locating Tip Switch first (HUP_DIGITIZERS-specific, only appears in the digitizer collection), then iterates index = 0, 1, 2, ... for Generic Desktop X / Y / Button 1 to skip past the Mouse-fallback occurrences and find the digitizer's. All eight fields plus X/Y logical ranges populate the parser_state cache. Verified bare-metal: every bit offset and logical range matches the descriptor decode in ADR 0018 section 1.

  4. Implement the per-Report-ID-7 parser that emits touch.touch_down / touch.touch_move / touch.touch_up events per shared/INPUT_EVENTS.md. Done 2026-05-06 (commit fb018da). Mapping landed:

    • Tip switch rising edge for a contact ID → touch.touch_down with that contact ID.
    • Subsequent reports with same contact ID and tip switch high → touch.touch_move.
    • Tip switch falling edge → touch.touch_up.
    • Confidence low → treated as tip_switch=0 (synthesises touch_up if mid-contact, suppresses touch_down on new low-confidence contact). Recorded as Q2 design choice.
    • Scan time extracted but not surfaced; ts_ordering uses kernel monotonic, matching the pointer/keyboard paths.
    • Button 1 → emitted via INPUTFS_SOURCE_POINTER with the same payload layout the Mouse-fallback Report-ID-1 path uses, so consumers see one button_down/up per click regardless of which mode the device is in. Position attached to the button event uses the most recent active contact's last pixel position.
    • Per-report emission, no frame batching (Q1 design choice). Each Report ID 7 arrival produces exactly one event for one contact; frame structure is opaque to consumers.

    Verified bare-metal across multiple gestures: single-finger drag produced 1 type1 + ~110 type2 + 1 type3 over ~900ms; two-finger drag produced interleaved per-contact-id type1/type3 cycles with continuous type2 emission; brief taps produced cleanly bounded type1/type2/type3 sequences. Timestamps monotonic. No crashes across hundreds of frames.
  5. Implement the feature-report send at attach time to switch the device into Touchpad Mode. Done 2026-05-06 (commit fb018da). The original plan had this targeting Report ID 12 (Surface Switch + Button Switch) but the ADR amendment of 2026-05-05 (commit 6386360) corrected this after review of FreeBSD's wmt(4) and hmt(4) sources: the load-bearing field is Device Mode (HUP_DIGITIZERS, HUD_INPUT_MODE, usage 0x52) which on the HAILUCK lives in Report ID 11. Setting Device Mode = 0x03 (Multi-touch Touchpad) enables Report ID 7 emission. Surface Switch and Button Switch are secondary controls that default to enabled; inputfs leaves them at defaults.

    The setter writes a full-rlen buffer (rlen from hid_report_size, including the report-ID byte) with memset + buf[0] = report_id + hid_put_udata + hid_set_report(dev, buf, rlen, HID_FEATURE_REPORT, report_id). This pattern handles devices where Device Mode is packed into a larger configuration report alongside other fields without clobbering them.

    Failure is non-fatal: warning logged, attach proceeds, device stays in Mouse Mode. Verified bare-metal: dmesg attach line shows "Device Mode set to MT Touchpad (report_id=11 rlen=2)" — the SET_REPORT succeeded and the HAILUCK started emitting Report ID 7 immediately.

  6. Verify scenarios 7-9 of the runbook end-to-end on the HAILUCK trackpad. Done 2026-05-06. Operator session on pgsd-bare-metal-test-machine captured dmesg attach output (full chain present) plus inputdump traces for several gestures. The traces show correct contact lifecycle, per-contact tracking across overlapping gestures, monotonic timestamps, and per-report event emission at the ~125 Hz rate the descriptor declares. Phase 2.5 status doc section 7 updated to "Verified" with this evidence; AD-2a Phase 2.5 multi-touch deferral entry closed.

    One small follow-up surfaced: inputdump's pretty-printer renders touch events as touch.type1 / type2 / type3 rather than touch_down / touch_move / touch_up. Cosmetic, tracked separately as an inputdump symbol-table fix. The integrated clickpad button transitions weren't exercised in this verification session; a follow-up gesture test will cover that path.
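The contact lifecycle mapping from step 4 (tip-switch edges per contact ID, with low confidence treated as tip_switch=0) reduces to a small per-contact state machine. The sketch below is a userspace model with hypothetical names (`touch_tracker`, `touch_step`); the eight-contact limit follows from the 3-bit Contact Identifier, and the TOUCH_NONE result stands for "emit nothing for this report".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONTACTS 8u /* 3-bit Contact Identifier */

enum touch_event { TOUCH_NONE, TOUCH_DOWN, TOUCH_MOVE, TOUCH_UP };

struct touch_tracker {
    bool active[MAX_CONTACTS]; /* true while a contact is considered down */
};

/* Consumes one report's worth of fields for one contact and returns the event
 * to emit. Low confidence is folded into the tip switch (the Q2 choice). */
static enum touch_event touch_step(struct touch_tracker *t, uint8_t contact_id,
                                   bool tip_switch, bool confidence)
{
    assert(contact_id < MAX_CONTACTS);

    bool effective_tip = tip_switch && confidence;
    bool was_active = t->active[contact_id];

    t->active[contact_id] = effective_tip;
    if (effective_tip && !was_active)
        return TOUCH_DOWN; /* rising edge */
    if (effective_tip && was_active)
        return TOUCH_MOVE; /* contact continues */
    if (!effective_tip && was_active)
        return TOUCH_UP;   /* falling edge, or confidence lost mid-contact */
    return TOUCH_NONE;     /* idle, or low-confidence contact suppressed */
}
```

Because state is kept per contact ID, interleaved reports for two fingers produce independent down/move/up sequences, which is the pattern seen in the two-finger drag traces.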

Documentation: ADR 0018 (HUP_DIGITIZERS parser design) plus its 2026-05-05 amendment captures the Mouse-Mode-vs-Touchpad-Mode subtlety, the Device Mode feature-report send at attach (Report ID 11, value 0x03, not Surface/Button Switch as the original ADR draft mistakenly said; corrected after review of FreeBSD's wmt(4) and hmt(4) sources), the contact-ID lifecycle mapping to touch events, the exclusive-HID-consumer architectural invariant on which the single-attach pattern depends, the Q1 per-report emission policy, and the Q2 confidence-low-as-tip-switch=0 policy.
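The buffer-building half of the Device Mode send (zero the full-length buffer, place the report ID in byte 0, pack the mode value into its field) can be sketched without any kernel API. Here `put_bits` is a hypothetical stand-in for the role hid_put_udata plays, and the field's bit position and width are assumptions supplied by the caller; on real hardware they come from the located descriptor field. The transport call itself is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes bit_count bits of value, least significant bit
 * first, starting at bit_pos counted from the start of the buffer. */
static void put_bits(uint8_t *buf, size_t len, unsigned bit_pos,
                     unsigned bit_count, uint32_t value)
{
    for (unsigned i = 0; i < bit_count; i++) {
        unsigned bit = bit_pos + i;
        size_t byte = bit / 8u;
        uint8_t mask = (uint8_t)(1u << (bit % 8u));

        assert(byte < len);
        if ((value >> i) & 1u)
            buf[byte] |= mask;
        else
            buf[byte] &= (uint8_t)~mask;
    }
}

/* Builds a feature report of rlen bytes (report-ID byte included) carrying
 * the requested mode. Returns the number of bytes to send. */
static size_t build_device_mode_report(uint8_t *buf, size_t rlen,
                                       uint8_t report_id,
                                       unsigned field_bit_pos,
                                       unsigned field_bits, uint32_t mode)
{
    assert(rlen >= 1);
    memset(buf, 0, rlen);
    buf[0] = report_id;
    put_bits(buf, rlen, field_bit_pos, field_bits, mode);
    return rlen;
}
```

For a two-byte report with ID 11 and the mode field assumed to occupy the whole second byte, writing mode 0x03 yields the bytes 0x0B 0x03.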

Out of scope for v1: pen events (HUP_DIGITIZERS usage 0x02), in-range without tip-switch (hover), pressure, contact area. These map to additional pen.* event types in shared/INPUT_EVENTS.md and are tracked under AD-1's pen support sub-item, not this one. Touch-screen variant (usage 0x04) is also out of scope; the work is structurally similar but requires touchscreen hardware in the lab.

Effect on AD-2a: Closed 2026-05-06. The deferred Phase 2.5 multi-touch verification entry no longer points to this sub-item as a blocker; it points to the bare-metal verification evidence above. Phase 2.5 closes on scenarios 1-6 independently; this sub-item allowed the deferred 7-9 to be verified. Phase 3 (deletions) is not affected.

Sub-item: Apple multi-touch trackpad support (Medium-Large)

Status: open, parallel future work; not gating Phase 2.5 multi-touch verification.

Why this is separate: Apple's Magic Trackpad family (Magic Trackpad 1/2/3) does not use Microsoft Win8+ HUP_DIGITIZERS. It uses a vendor-specific multi-touch protocol with custom report descriptors, Apple-defined finger structures (4D coordinates including pressure and finger angle), and a vendor-specific feature-report write to enable raw multi-touch reporting (without which the trackpad emits a basic mouse-class report only). FreeBSD has limited prior art for this hardware; the relevant reference implementations are Linux's hid-magicmouse driver and the older bcm5974 driver for built-in MacBook trackpads.

Hardware target: an Apple Bluetooth Magic Trackpad is available in the lab but is not currently paired with pgsd-bare-metal-test-machine. Pairing logistics on FreeBSD (hcsecd configuration, link-key persistence, reconnection handling) are part of this sub-item's scope.

Implementation outline (rough; finer scoping deferred until prior-art reading is done):

  1. Pair the trackpad with the test machine; confirm it attaches to inputfs via the Broadcom Bluetooth stack and capture its HID descriptor via the existing hw.inputfs.debug_descriptor sysctl.
  2. Identify which Apple multi-touch protocol variant the device uses (Magic Trackpad 2/3 are different; Bluetooth-vs-USB also differs).
  3. Implement the vendor-specific feature-report write to enter raw multi-touch mode.
  4. Implement an Apple-protocol parser variant alongside the HUP_DIGITIZERS parser, sharing the same inputfs_digitizer_* infrastructure for event emission.
  5. Verify scenarios 7-9 of the runbook on the Apple trackpad.

Dependency on the HUP_DIGITIZERS sub-item: the shared digitizer infrastructure (classifier role, event emission paths, contact-ID lifecycle handling) should land first via the HUP_DIGITIZERS sub-item. This sub-item then adds an Apple-protocol recogniser that plugs into that infrastructure rather than duplicating it.

Effect on AD-2a: none. The Phase 2.5 multi-touch deferral entry closes on the HUP_DIGITIZERS sub-item against the HAILUCK trackpad. This sub-item is parallel future work that adds support for a second hardware family.

Sub-item: pollable-fd / kqfilter for the events ring (Small-Medium)

Status: superseded by AD-41.3 / ADR 0021, 2026-05-27 evening. The latency improvement this sub-item describes was delivered by AD-41.3 via a different mechanism than this sub-item proposed: a separate notification character device, /dev/inputfs_notify, designed by ADR 0021 (Proposed 2026-05-25, implementation landed 2026-05-26/27). The notify cdev provides full d_poll + d_kqfilter support (verified in inputfs/sys/dev/inputfs/inputfs.c:1556-1684). semadrawd's main loop adds the notify fd to its poll set; when inputfs publishes an event, the notify fd becomes readable and the daemon wakes. This achieves the same latency benefit this sub-item described, while keeping the data plane (mmap of /var/run/sema/input/events) decoupled from the notification plane.
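From the consumer's side, waiting on a notification fd rather than polling the ring at an interval looks roughly like the sketch below. It uses only poll(2), open(2), and close(2); the helper names are hypothetical, the path is passed in by the caller, and what (if anything) a consumer should read from the notify device after waking is not specified here, so the sketch stops at "readable".

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns true if fd became readable within timeout_ms; false on timeout
 * or error. A negative timeout blocks indefinitely, as with poll(2). */
static bool wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    assert(fd >= 0);
    int rc = poll(&pfd, 1, timeout_ms);
    return rc > 0 && (pfd.revents & POLLIN) != 0;
}

/* Opens the notification device at path, waits once, and closes it.
 * Returns 1 if readable, 0 on timeout, -1 if the device cannot be opened. */
static int wait_for_notify(const char *path, int timeout_ms)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    int ready = wait_readable(fd, timeout_ms) ? 1 : 0;
    close(fd);
    return ready;
}
```

A long-running daemon would keep the descriptor open and add it to its existing poll set instead of opening it per wait; the one-shot helper is only there to keep the example short.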

Closing this sub-item rather than implementing it: ADR 0021's deliberate design choice was to use a separate notification surface rather than fold kqfilter onto the data device, on the reasoning that the data plane and notification plane have different lifecycle and access patterns. Adding d_kqfilter to the /dev/inputfs cdev as well would duplicate the capability without a clear consumer; nothing in the codebase opens /dev/inputfs and waits on it. If a future consumer needs a single fd that combines data and notification, that would be a fresh design decision (and likely a fresh ADR), not a return to this sub-item.

Previous content (preserved for the record of what this sub-item was about before AD-41.3 superseded it):

Userspace consumers (inputdump, semadrawd's drain loop) poll the ring at an interval rather than blocking on event arrival. The poll-and-drain pattern absorbs events fine in practice with no measurable drop, so this is a latency improvement rather than a correctness gate. Plan: add d_kqfilter to the inputfs cdevsw, implement EVFILT_READ against the events ring's write index, fire knote on inputfs_event_emit completion. Half a day to a day of kernel work plus the inputdump CLI flag.

Effect on AD-2: none. AD-2 closed 2026-05-17, independently of this sub-item.

[~] AD-3: Audio output: replace OSS dependency (In progress, Large; sequenced under ADR 0008 with F-stage reconciliation in ADR 0011 (2026-05-28); chipset scope decided and codec enumeration discharged for confirmed target pgsd-bare-metal; commit-6.x series landed 2026-05-21 with audible output verified on pgsd-bare-metal iMac internal speaker; audit-as-gate retired by ADR 0010 (2026-05-27 evening); F.1 (state file) bench-verified [x] on pgsd-bare-metal 2026-05-28; F.2 (events ring) bench-verified [x] on pgsd-bare-metal 2026-05-28; F.3.a (continuous streaming) bench-verified [x] on pgsd-bare-metal 2026-05-29; F.3.b (user-controlled playback) bench-verified [x] on pgsd-bare-metal 2026-05-30; F.3.c (interrupt-driven position tracking) bench-verified [x] on pgsd-bare-metal 2026-05-31; F.3.d (xrun detection) bench-verified [x] on pgsd-bare-metal 2026-05-30 (per ADR 0017 with two post-bench amendments: detection point moved to user-ring shortfall at audiofs.c line 4353, and coalescing reframed as opportunistic); F.4 (clock writer) bench-verified [x] on pgsd-bare-metal 2026-06-01 (ADR 0018 Accepted: kernel becomes the clock writer via a wired shared mapping of /var/run/sema/clock; monotonic across stop-start verified, no leak across 70 kldload/kldunload cycles); F.3.e (format negotiation) bench-verified [x] on pgsd-bare-metal 2026-06-01 (ADR 0019 Accepted: rate-only negotiation 32k/44.1k/48k, 16-bit stereo fixed, native-only per ADR 0007; GET/SET_FORMAT ioctls on /dev/audiofs0; SET reconfigures the running stream and emits format_change; F.4 republishes the negotiated rate); F.3.f (HDMI bring-up) deferred 2026-06-01 (blocked on a UTF-provided display capability; see the F.3.f deferral note below); audiofs output DMA-boundary hum found during F.5.a bench and fixed 2026-06-02 (ADR 0022 localized it below the software path via refill-miss instrumentation and a byte-exact capture fork, then ADR 0023 resolved it: per-fragment interrupt servicing on a slack-free 2-entry DMA 
ring, fixed by deepening the ring to 4 entries/16KB after a depth sweep and under-load test; refill-miss counters retained as permanent observability); F.5.a (semasound mixer core) bench-verified [x] on pgsd-bare-metal 2026-06-02 (ADR 0021 Accepted and closed: Unix-socket broker, sole writer to /dev/audiofs0, sum-clip-zerofill mixer paced by blocking-write backpressure, reader-thread-per-client plus single mixer/output thread, xrun consumer polling the F.2 notify/events ring; all 11 criteria passed including multi-client mix, non-canonical rejection, stall and disconnect isolation, induced-xrun observe-and-continue, and ADR 0018 clock stop-start monotonicity; the ADR 0023 audiofs fix was a prerequisite for the clean mix), F.5.b-f (format adaptation, targets, policy, state publication, supervision) still owed, F.6 (semaaud retirement), and maintenance model still owed)

Tracks: audiofs/docs/audiofs-proposal.md (Stage F).

F.3.f (HDMI bring-up) deferral (2026-06-01). HDMI audio is not self-contained on the HDA codec. The HDMI/DisplayPort audio codec is the GPU's HDA function, separate from the analog codec, and it only reports pin presence, exposes an ELD, and clocks audio once the GPU display side has detected the sink, programmed the mode/transcoder/port, lit the link, and enabled the audio path, including writing the sink's ELD into the codec. That coordination is the role drm-kmod fills. PGSD will not use drm-kmod; the equivalent display/modeset and HDMI-audio-enable capability is to be provided within UTF and does not yet exist. Until it does, audiofs has no powered or populated HDMI codec to act on, so the ADR 0011 F.3.f scope (presence detection, audio infoframes, stream verification) cannot be implemented or bench-verified. F.3.f is therefore deferred, blocked on that UTF-provided display capability, not on hardware and not on audiofs. It is off the AD-3 critical path: ADR 0011 makes F.3.f parallel to F.3.a-e, and F.5 depends on F.3.b-e (all closed), so AD-3 proceeds via F.5 then F.6 without it. When unblocked, verification moves to a laptop with a working HDMI/DP output and an audio sink; pgsd-bare-metal (the iMac) cannot drive HDMI audio end to end. This dependency is distinct from DF-6 (drawfs DRM-backend wiring), which targets the existing drm-kmod KMS path; the no-drm-kmod direction means the UTF-native display capability F.3.f waits on is a separate, yet-to-be-scoped effort.

semaaud currently uses OSS (FreeBSD's kernel audio framework) for audio output. OSS is accepted as platform transport today (docs/UTF_ARCHITECTURAL_DISCIPLINE.md). Direct hardware driving, analogous to how inputfs replaces evdev, would remove this dependency entirely.

The native substrate is named audiofs on the kernel side and semasound on the userland side, mirroring inputfs / semainput. audiofs is a direct PCI driver that class-matches on PCI HDA controllers (class MULTIMEDIA, subclass MULTIMEDIA_HDA per the PCI spec). The match is vendor-agnostic: any controller the PCI spec calls HDA-class is in scope, regardless of silicon vendor. On pgsd-bare-metal this includes both the Intel Sunrise Point analog HDA controller and the ATI Oland HDMI audio controller; the same audiofs binary attaches to both. Other HDA controllers (NVIDIA HDMI, VIA, SiS, ULI, and the rest of the Intel and AMD lineages) would also attach by the same class probe, but only Intel and AMD/ATI are observationally verified today. "Class-matched" and "verified on every controller in the class" are deliberately separate claims; the latter accumulates as bench evidence with real hardware. The snd(4) framework is removed from the PGSD kernel in full: the generic sound shim, the snd_hda driver, and the other in-tree snd drivers (cmi, csa, emu10kx, es137x, ich, via8233) are all absent. snd_hda had to go for audiofs to take the codec-attach slot without a name-locked binding from snd_hda's hdacc children intercepting it; the others are removed under the broader principle that if PGSD does not target the hardware, the kernel does not compile in the driver (consistent with AD-8 HID exclusions and AD-39 console-driver removals). audiofs uses dev/sound/pci/hda/hda_reg.h and hdac_reg.h as header-only sources of HDA register definitions; there is no runtime dependency on snd(4) code, and with the framework removed there is no /dev/dsp* surface for anything to fall back to. The removal is documented in-situ at pgsd-kernel/PGSD (sound section) and propagated through WITHOUT_MODULES_NAMES in pgsd-kernel/pgsd-kernel-build.sh. 
audiofs publishes /var/run/sema/audio/{state,events} and takes over clock-writing duty from semaaud (the kernel knows the actual sample position more accurately than userland readback). semasound, when implemented, will talk to those files directly. semasound inherits semaaud's durable-policy work (Phase 12), named-target topology, mixer logic, control socket, and runtime UI state, but talks to audiofs instead of /dev/dsp*. semaaud retires once semasound is verified end-to-end (analogous to AD-2 for semainput).

This is substantial work. Real-time audio has harder timing constraints than input (buffer underrun is immediately audible), vendor-specific audio hardware programming is complex, and the existing OSS interface is reasonably stable. The proposal landed 2026-04-29 (commit 88b9405) and identifies six open architectural questions that subsequent ADRs will resolve before any kernel code is written: Q1 data path (tmpfs ring vs kernel-mapped DMA vs hybrid), Q2 mixer location, Q3 OSS coexistence model, Q4 format negotiation, Q5 latency targets, and Q6 serialization format for semasound's userland surfaces. The pre-survey BACKLOG entry counted only four; Q5 (latency) and Q6 (serialization) were added to the proposal during review and the BACKLOG entry is corrected here.

Stage F.0 (architectural ADRs) is in progress under audiofs/docs/adr/. ADR 0001 establishes the per-question ADR structure; ADR 0002 resolves Q3 (OSS coexistence) with end-state Exclusive, migration-time per-device sysctl assignment, Layered rejected. Q1, Q2, Q4, Q5, and Q6 remain open as of this commit.

Implementation (Stage F.1 onward) depends on AD-2 closing first and on F.0's six ADRs being accepted. F.0 ADR work itself is documentation, not implementation, and can proceed in parallel with AD-2 thinking.

Status update 2026-05-17 (supersedes the F.0-in-progress text above; original retained as the record of where F.0 stood mid-stream). F.0 is complete. The architectural ADR set is 0001-0007, all Accepted: 0002 (OSS coexistence), 0003 (clock writer, plus section 8 the snd(4)-dependency analysis), 0004 (mixer location), 0005 (userland architecture), 0006 (replace snd(4) in full, with governance independence as the primary recorded rationale), 0007 (the physics/semantics boundary governing audiofs content). The work exceeded the original six-question frame: 0006 and 0007 are architectural decisions beyond the Q1-Q6 list. AD-2 has closed (see AD-2 entry). The governance-independence rationale and its inputfs/hms(4) precedent are recorded in docs/UTF_ARCHITECTURAL_DISCIPLINE.md. The premise-validation method is specified (not performed) in audiofs/docs/snd4-gap-governance-audit.md.

AD-3 is now sequenced and gated by ADR 0008 (Stage F scope). It is still Open and not started, and deliberately so: ADR 0008 makes the gap-and-governance audit a strict gate that runs before any audiofs code, and makes two scope inputs (the target chipset list and the maintenance model) owed, gating preconditions that ADR 0008 explicitly does not invent. Scheduling a start date is a separate act that requires those owed inputs and the audit to exist first. The Q1 data-path question remains the one F.0-era question without its own ADR; it is folded into F.3's own sub-stage ADR per ADR 0008 section 5 rather than left as a standalone F.0 gap.

Status update 2026-05-17 (chipset scope partially discharged). ADR 0008 section 3a now records the chipset scope rule as decided (HDA-class and USB-audio-class, 2016+, confirmed-target machines) and pins the controller-level targets from real pgsd-dev pciconf/usbconfig output (Intel Comet Lake PCH HDA; NVIDIA GP108 HDMI; Logitech G433 USB-audio). The owed item is narrowed: not "produce the chipset list" but the specific action of recording hdacc/hdaa codec identities via cat /dev/sndstat and verbose dmesg on confirmed-target machines. The maintenance model remains fully owed and unchanged. HDMI audio is in scope at full guarantee; its dependency on third-party GPU/display infrastructure is recorded in ADR 0008 section 3a as transitional (not an accepted architectural exception), with a flagged future-overlap note that full HDMI guarantee may require UTF-owned display/GPU work overlapping AD-4, the extent of which is deferred for future analysis and explicitly not declared a merged AD-3/AD-4 scope here.

Status update 2026-05-17 (machine correction; codec discharged; regime and platform questions open). The preceding status block recorded controller-level targets from pgsd-dev. That was the wrong machine: pgsd-dev is not the AD-3 confirmed target. ADR 0008 section 3a is superseded accordingly (not amended): the confirmed target is pgsd-bare-metal (the PGSD bare-metal machine, Apple/Skylake, where Phase 2.5 input verification ran). Its codec enumeration is now discharged from the real machine's dmesg: primary analog is a Cirrus Logic CS4206 (classic HDA; pcm6 internal speaker, pcm7 headphones, pcm8 digital), and the HDMI path is an ATI R6xx HDA codec on an ATI Oland controller (pcm0-pcm5), AMD/ATI not NVIDIA. No USB audio device is present on the confirmed target. F.scope.a is therefore discharged for pgsd-bare-metal; it re-opens per the scope rule only for any future confirmed-target machine. The maintenance model remains fully owed and unchanged. Two questions are recorded in 3a as explicitly open, deliberately not decided as ADRs: (1) ADR 0006's ownership strategy is established only for the classic-HDA regime; its applicability to post-HDA SST/cAVS/SoundWire/SOF firmware-pipeline hardware is an unresolved risk gating any extend-as-needed step into non-classic-HDA hardware; (2) whether PGSD should be designed for a specific hardware platform (e.g. PCIe-slot desktop/server only, in the manner Apple targets controlled hardware) is a foundational project-identity question under deliberate consideration by the decision owner, not settled here and deliberately not recorded as an ADR to avoid a platform-class commitment being made as a side effect of the audio thread.

Status update 2026-05-20 (experimental implementation started; gate posture corrected). Implementation of audiofs as a FreeBSD PCI driver for HDA controllers began this date on pgsd-bare-metal, with explicit decision-owner ratification. The work crosses what ADR 0008 framed as a gate against "any audiofs code before the audit clears" and the BACKLOG previously framed as "not started"; both framings are corrected here rather than papered over.

The correction is doctrinal, not casual: discipline bites on decay of provisional state into silent canonical truth, not on the existence of provisional implementation work itself. Running code that knows itself to be experimental, labeled as such, accepted by an informed decision owner as such, and producing empirical evidence that informs (does not pre-empt) audit and ratification, is a legitimate mode of substrate research. The handoff packet for independent verification of the original audit surface remains frozen and available; this implementation work runs in parallel with that, not in place of it.

What landed today as audiofs/sys/dev/audiofs/audiofs.c (478 lines, em-dash-clean, no normative-status assertions):

  • Attaches as a PCI class driver (PCIC_MULTIMEDIA / PCIS_MULTIMEDIA_HDA) to HDA controllers. Verified attaching to both controllers on pgsd-bare-metal: audiofs0 on Intel Sunrise Point HDA (the CS4206 host), audiofs1 on ATI Oland HDA (HDMI).
  • Maps BAR0, reads HDA version and GCAP, logs capability counts via dmesg and a per-instance sysctl-readable eventlog ring.
  • Resets the controller per HDA 1.0a section 4.3, with the reset sequence mirroring hdac.c's hdac_reset() verbatim (no fabrication; source-grounded against releng/15.0 sys/dev/sound/pci/hda/hdac.c).
  • Exposes dev.audiofs.N.{num_iss,num_oss,num_bss,support_64bit,pci_vendor,pci_device,eventlog} sysctls.

Empirical findings from first contact, recorded here as the substrate-evidence stream the gate ultimately rests on:

  • audiofs0 (Intel/Cirrus host): HDA v1.0, GCAP=0x9701, ISS=7 OSS=9 BSS=0, 64-bit DMA supported.
  • audiofs1 (ATI Oland HDMI): HDA v1.0, GCAP varies between loads (0x0000 first cold load, 0x6003 on reload). ISS=0 OSS=6 BSS=0 on the working read, 64-bit supported. Power-state interaction with the GPU power well is the most likely cause; deferred for later investigation.
  • Both controllers reset cleanly. No panics, no resource leaks across multiple load/unload cycles.
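
The stream counts above follow directly from the GCAP values. A minimal sketch of that decode, assuming the GCAP field layout from HDA 1.0a (OSS in bits 15:12, ISS in 11:8, BSS in 7:3, NSDO in 2:1, 64OK in bit 0); the struct and function names are illustrative, not the names used in audiofs.c:

```c
#include <stdint.h>

/* Decoded view of the 16-bit GCAP register (HDA 1.0a, controller offset 0x00). */
struct gcap_fields {
	unsigned oss;   /* output stream count, bits 15:12 */
	unsigned iss;   /* input stream count, bits 11:8 */
	unsigned bss;   /* bidirectional stream count, bits 7:3 */
	unsigned nsdo;  /* serial-data-out encoding, bits 2:1 */
	unsigned ok64;  /* 1 if 64-bit addressing is supported, bit 0 */
};

/* Hypothetical helper: split a raw GCAP read into its fields. */
static struct gcap_fields
gcap_decode(uint16_t gcap)
{
	struct gcap_fields f;

	f.oss  = (gcap >> 12) & 0xf;
	f.iss  = (gcap >> 8) & 0xf;
	f.bss  = (gcap >> 3) & 0x1f;
	f.nsdo = (gcap >> 1) & 0x3;
	f.ok64 = gcap & 0x1;
	return (f);
}
```

Under this layout, 0x9701 decodes to OSS=9 ISS=7 BSS=0 with 64-bit set, and 0x6003 decodes to OSS=6 ISS=0 BSS=0 with 64-bit set, matching the figures recorded for audiofs0 and the working audiofs1 read. A cold-load 0x0000 decodes to all-zero counts, which is why it is treated as a bad read and not as a capability statement.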

Out of scope for this commit (explicitly):

  • CORB/RIRB ring setup. No codec command dispatch yet; no codec enumeration, no widget walk, no pin configuration, no streaming, no CLOCK region writing, no AUDIO_STATE / AUDIO_EVENTS. These land in subsequent commits.
  • Power-state management for the ATI HDMI block. Cold-load GCAP=0 is a deferred investigation, not a regression.
  • snd(4) coexistence behavior. ADR 0002's posture is unaffected; this PCI driver claims the HDA controllers by BUS_PROBE_DEFAULT and snd_hda is no longer in the PGSD kernel (see PGSD kernel config commit, this date).

The PGSD kernel config (pgsd-kernel/PGSD) removed device snd_hda and added hda to WITHOUT_MODULES_NAMES in pgsd-kernel/pgsd-kernel-build.sh to suppress the module from the build. This was the precondition for audiofs taking the PCI HDA controllers directly without contesting hdac.c's name-locked codec-child probe. The maintenance-model owed item now includes responsibility for the kernel configuration delta this introduces.

The original audit specification (audiofs/docs/snd4-gap-governance-audit.md) and its handoff packet remain authoritative. Their relationship to this implementation work: the audit was specified to validate ADR 0006's premise that gaps in snd(4) substantively justify replacement. This implementation demonstrates that replacement is mechanically achievable at the controller layer; it does not retire the audit's question, which concerns whether the premise was correctly analyzed. The audit's verdict and this implementation's empirical evidence are both inputs to eventual ratification, not substitutes for each other.

Next concrete substrate steps (not constitutional acts):

  • Commit 2: CORB/RIRB rings, codec enumeration via STATESTS + GET_PARAMETER. Identifies Cirrus CS4206 at its codec address, ATI R6xx at its.
  • Commit 3: function-group walk, widget enumeration, pin configuration for analog headphone output on CS4206.
  • Commit 4+: stream descriptor allocation, DMA buffer setup, audible-output target.

CLOCK region writing (the audit's central concern) lands no earlier than commit 4 and remains explicitly provisional until ADR 0003's transition from semaaud to audiofs as canonical clock writer is independently verified.

Status update 2026-05-20 (commit 2: CORB/RIRB + codec enumeration). Codec layer first contact achieved through audiofs' own command path. Both HDA controllers on pgsd-bare-metal now respond to verb dispatch through audiofs-owned CORB/RIRB rings:

  • audiofs0 (Intel Sunrise Point host): Cirrus Logic CS4206 at cad=0, vendor=0x1013 device=0x4206 rev=0x03.02. This is the analog codec driving internal speaker, headphone jack, and S/PDIF.
  • audiofs1 (ATI Oland HDMI): ATI R6xx codec at cad=0, vendor=0x1002 device=0xaa01 rev=0x03.00. HDMI audio block.

CORB and RIRB both 256 entries, DMA-allocated through bus_dma_tag_create + bus_dmamem_alloc + bus_dmamap_load, with the same alignment and 64-bit coherent settings hdac(4) uses. Commands sent via CORB writes with bus_dmamap_sync PREWRITE/POSTWRITE; responses read by polling RIRB write pointer with PREREAD/POSTREAD sync. No interrupts, no unsolicited handling beyond logged-and-discarded.
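
For readers unfamiliar with the verb format, the command word written into a CORB slot can be sketched as below. This is an illustration under stated assumptions, not the audiofs.c code: it assumes the 12-bit-verb form (codec address in bits 31:28, node id in bits 27:20, verb in bits 19:8, payload in bits 7:0), GET_PARAMETER as verb 0xF00, and parameter 0x00 returning vendor id in the upper half and device id in the lower half. All helper names are hypothetical.

```c
#include <stdint.h>

#define HDA_VERB_GET_PARAMETER 0xF00u /* 12-bit verb id */
#define HDA_PARAM_VENDOR_ID    0x00u  /* parameter: vendor id / device id */

/* Hypothetical helper: pack a 12-bit verb and 8-bit payload into a CORB word. */
static uint32_t
hda_verb12(unsigned cad, unsigned nid, unsigned verb, unsigned payload)
{
	return (((uint32_t)(cad & 0xf) << 28) |
	    ((uint32_t)(nid & 0xff) << 20) |
	    ((uint32_t)(verb & 0xfff) << 8) |
	    (uint32_t)(payload & 0xff));
}

/* Split a vendor-id parameter response into its two halves. */
static uint16_t
hda_resp_vendor(uint32_t resp)
{
	return ((uint16_t)(resp >> 16));
}

static uint16_t
hda_resp_device(uint32_t resp)
{
	return ((uint16_t)(resp & 0xffff));
}
```

Asking the root node (nid 0) of the codec at cad=0 for its vendor id is then the word 0x000F0000. A response word of 0x10134206 (reconstructed here from the recorded vendor and device values, not copied from a log) splits into vendor 0x1013 and device 0x4206.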

This commit makes the substrate claim in ADR 0006 (audiofs owns the codec command path) operationally true. The specific call site the original audit cared about (hdaa_channel_getptr and its 128-byte alignment) is now reachable from audiofs's own code via HDA_CMD_GET_PARAMETER verb dispatch.

What this commit explicitly does not establish:

  • Function-group walk, widget enumeration, pin configuration. Those land in commit 3.
  • Stream descriptors, DMA buffers, audible output. Commit 4+.
  • CLOCK region writing. Still no earlier than commit 4, still gated on ADR 0003 transition verification.
  • Any retirement of the original audit. The audit's question (whether ADR 0006's premise was correctly analyzed) is independent of this implementation's mechanical correctness; both remain inputs to ratification.

Status update 2026-05-20 (commit 3: function-group walk and widget enumeration). Codec topology now visible through audiofs's own command path. For each populated codec on pgsd-bare-metal, audiofs queries sub-node counts, classifies function groups (audio vs modem), reads subsystem id, walks every widget under audio FGs, queries each widget's audio-widget-cap, and for pin complexes reads the configuration-default register.

Substantive cross-check against the previous snd_hda attach: subsystem ids match (CS4206 = 0x106b8200, ATI R6xx = 0x00aa0100), widget counts match (CS4206: 20 widgets at nid 2-21; ATI: 12 widgets at nid 2-13), and pin configuration defaults decode to the same role assignments snd_hda's pcm devices implied:

  • CS4206 nid=10 pin_cfg=0x002b4020 -> HP_Out, jack, front, black. The headphone jack (was pcm7 under snd_hda).
  • CS4206 nid=11 pin_cfg=0x90100112 -> Speaker, fixed, internal. The internal speaker (was pcm6 under snd_hda).
  • CS4206 nid=14 pin_cfg=0x90a60100 -> Mic_In, fixed, internal. The internal microphone.
  • CS4206 nid=16 pin_cfg=0x004be030 -> SPDIF_Out, jack, top. The optical S/PDIF (was pcm8 under snd_hda).
  • ATI R6xx nids 3,5,7,9,11,13: all pin_cfg=0x185600f0, i.e. Digital_Other_Out, all internal-special locations. The six HDMI streams (were pcm0-pcm5 under snd_hda).
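
The role assignments above come from two fields of the configuration-default register. A minimal sketch of that part of the decode, assuming the HDA 1.0a layout (port connectivity in bits 31:30, default device in bits 23:20); location, colour, and association fields are deliberately left out, and the function names are illustrative:

```c
#include <stdint.h>

/* Port connectivity, bits 31:30: 0 = jack, 1 = no connection, 2 = fixed, 3 = both. */
static unsigned
pin_cfg_connectivity(uint32_t cfg)
{
	return ((cfg >> 30) & 0x3);
}

/* Default device, bits 23:20. */
static unsigned
pin_cfg_device(uint32_t cfg)
{
	return ((cfg >> 20) & 0xf);
}

/* Printable name for the default-device code, in the spelling this log uses. */
static const char *
pin_cfg_device_name(uint32_t cfg)
{
	static const char *const names[16] = {
		"Line_Out", "Speaker", "HP_Out", "CD",
		"SPDIF_Out", "Digital_Other_Out", "Modem_Line", "Modem_Handset",
		"Line_In", "AUX", "Mic_In", "Telephony",
		"SPDIF_In", "Digital_Other_In", "Reserved", "Other"
	};

	return (names[pin_cfg_device(cfg)]);
}
```

With these helpers, 0x90100112 yields device 0x1 (Speaker) with fixed connectivity, 0x90a60100 yields 0xa (Mic_In) with fixed connectivity, 0x002b4020 yields 0x2 (HP_Out) on a jack, 0x004be030 yields 0x4 (SPDIF_Out) on a jack, and 0x185600f0 yields 0x5 (Digital_Other_Out).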

audiofs's view of the hardware is now operationally identical to snd_hda's, through completely independent code: our CORB/RIRB rings, our verb dispatch, our parameter parsing. This is independent corroboration of the codec layer's content, not a confirmation that the content was correctly analyzed (which remains the audit's question).

Commit-3 scope deliberately stops at enumerate-and-log. What this commit does not establish:

  • Connection-list reads (which widgets feed which).
  • Amplifier capability and current state.
  • Stream format capability per converter.
  • Graph construction: a path from a DAC widget through any intermediate mixers/selectors to a pin. Commit 4.
  • Pin widget control (turning the headphone pin's output enable on; setting EAPD/HP-Amp-Enable). Commit 4.
  • Stream descriptor allocation, BDL setup, DMA buffer binding, audible output. Commit 5+.
  • CLOCK region writing. Still no earlier than commit 4, still gated on ADR 0003 transition verification.

Status update 2026-05-21 (commit 6e: audible test signal, spec-complete with empirical limit on this hardware). All spec-defined steps required to produce audible output have been implemented and verified by hardware readback. The analog signal nevertheless does not reach the speaker on the pgsd-bare-metal iMac. The remaining gap is vendor-specific (Apple iMac downstream Class-D amplifier enable, not surfaced through HDA spec verbs) and is deliberately left for separate work so audiofs's spec-defined surface stays clean and small.

What landed in this commit:

Audio buffer:

  • Replaced the zero-filled buffer with a precomputed 750 Hz sine wave table (64 samples per period at 48 kHz, amplitude 16384, 32 complete periods per 8 KB buffer for seamless loop).
  • bus_dmamap_sync(PREWRITE) on the buffer after fill.
  • audiofs_sine_table[] at file scope, static const.
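
The buffer figures can be checked with integer arithmetic. The sketch below assumes 4-byte frames (16-bit samples, two channels) and that the one-period mono table is copied to both channels; neither the helper names nor the duplication to both channels is taken from audiofs.c.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_BYTES 4u /* assumed: 16-bit samples x 2 channels */

/* Tone frequency produced by replaying a one-period table at rate_hz. */
static unsigned
table_tone_hz(unsigned rate_hz, unsigned period_frames)
{
	return (rate_hz / period_frames);
}

/* Whole periods in the buffer, or 0 if the loop seam would not line up. */
static unsigned
whole_periods(size_t buf_bytes, unsigned period_frames)
{
	size_t period_bytes = (size_t)period_frames * FRAME_BYTES;

	if (period_bytes == 0 || buf_bytes % period_bytes != 0)
		return (0);
	return ((unsigned)(buf_bytes / period_bytes));
}

/* Repeat the mono table across an interleaved stereo buffer. */
static void
fill_stereo_loop(int16_t *buf, size_t buf_bytes, const int16_t *table,
    unsigned period_frames)
{
	size_t frames = buf_bytes / FRAME_BYTES;

	for (size_t i = 0; i < frames; i++) {
		int16_t s = table[i % period_frames];

		buf[2 * i] = s;     /* left */
		buf[2 * i + 1] = s; /* right */
	}
}
```

For a 64-frame table at 48 kHz this gives 750 Hz, and an 8192-byte buffer holds exactly 32 periods, so the last frame of the buffer is the last table entry and the DMA wrap lands on the first one.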

Output selection priority:

  • Replaced "first matching pin wins" with a priority table favoring outputs a developer is most likely to hear at their desk: Speaker (10) > HP_Out (5) > Line_Out (3) > Digital_Other_Out (2) > SPDIF_Out (1) > CD (0).
  • This is the attach-time test-signal policy. Real output policy with jack-presence detection is separate.
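
A minimal sketch of such a priority pick, keyed on the pin's default-device code. The priority numbers are the ones listed above; the device codes assume the HDA configuration-default encoding (0x0 Line_Out, 0x1 Speaker, 0x2 HP_Out, 0x3 CD, 0x4 SPDIF_Out, 0x5 Digital_Other_Out), and the function names and the first-wins tie-break are illustrative choices, not confirmed details of audiofs.c.

```c
/* Priority of a pin's default-device code for the attach-time test tone. */
static int
output_priority(unsigned device)
{
	switch (device) {
	case 0x1: return (10); /* Speaker */
	case 0x2: return (5);  /* HP_Out */
	case 0x0: return (3);  /* Line_Out */
	case 0x5: return (2);  /* Digital_Other_Out */
	case 0x4: return (1);  /* SPDIF_Out */
	case 0x3: return (0);  /* CD */
	default:  return (-1); /* not an output candidate */
	}
}

/* Index of the highest-priority candidate, or -1 if none qualifies. */
static int
pick_output(const unsigned *devices, int count)
{
	int best = -1;
	int best_prio = -1;

	for (int i = 0; i < count; i++) {
		int prio = output_priority(devices[i]);

		if (prio > best_prio) { /* strict: earlier pin wins ties */
			best = i;
			best_prio = prio;
		}
	}
	return (best);
}
```

Given candidates in the order HP_Out, Speaker, SPDIF_Out, the pick is the Speaker entry regardless of its position, which is the behavioural difference from "first matching pin wins".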

Power state management:

  • audiofs_power_up_widget: sends SET_POWER_STATE(D0) via verb 0x705, then polls GET_POWER_STATE until ACT = D0 with a 100 ms timeout. HDA widgets with the POWER_CTRL bit in their wcap come out of reset in D3 (sleep); D0 is required for them to process audio.
  • audiofs_power_up_codec_paths: powers up the FG and every widget on every discovered output path.
  • New pass in audiofs_walk_topology between path discovery and pin enable.
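
The poll-until-D0 step can be sketched in isolation. This assumes the GET_POWER_STATE response carries PS-Set in bits 3:0 and PS-Act in bits 7:4; the callback type, the polling step, and the simulated widget are all hypothetical scaffolding so the loop can be exercised without hardware.

```c
#include <stdint.h>

#define HDA_PS_D0 0x0u
#define HDA_PS_D3 0x3u

/* Returns the GET_POWER_STATE response for one widget. */
typedef uint32_t (*get_ps_fn)(void *ctx);

/* Actual power state, bits 7:4 of the response. */
static unsigned
ps_act(uint32_t resp)
{
	return ((resp >> 4) & 0xf);
}

/* Poll until ACT reports D0. Returns 0 on success, -1 on timeout. */
static int
poll_until_d0(get_ps_fn get_ps, void *ctx, unsigned timeout_ms,
    unsigned step_ms)
{
	unsigned waited = 0;

	if (step_ms == 0)
		step_ms = 1;
	for (;;) {
		if (ps_act(get_ps(ctx)) == HDA_PS_D0)
			return (0);
		if (waited >= timeout_ms)
			return (-1);
		/* A real driver would pause for step_ms here. */
		waited += step_ms;
	}
}

/* Simulated widget: reports ACT=D3 for a fixed number of reads, then D0. */
struct fake_widget {
	unsigned reads_until_d0;
};

static uint32_t
fake_get_ps(void *ctx)
{
	struct fake_widget *w = ctx;

	if (w->reads_until_d0 > 0) {
		w->reads_until_d0--;
		return ((HDA_PS_D3 << 4) | HDA_PS_D0); /* ACT=D3, SET=D0 */
	}
	return ((HDA_PS_D0 << 4) | HDA_PS_D0);         /* ACT=D0, SET=D0 */
}
```

The point of polling ACT instead of trusting the SET write is that SET reflects the request while ACT reflects what the widget has actually reached; a widget stuck in D3 shows up as a timeout, not as a silent failure later in the path.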

LPIB sampling extended:

  • AUDIOFS_LPIB_SAMPLES bumped 5 -> 30 (290 ms run total, audibly long if the analog stage emits a signal).
  • dmesg prints every 10th sample plus the last to keep the log readable; eventlog still has every sample for the empirical record.

Verified on pgsd-bare-metal at the spec-defined surface:

audiofs0 (CS4206): DAC nid=4 (Speaker) selected (priority 10). Power states transitioned to D0:

  • nid=10 (pin HP) ACT=D0 SET=D0
  • nid=3 (DAC HP) ACT=D0 SET=D0
  • nid=4 (DAC Speaker) ACT=D0 SET=D0
  • nid=8 (DAC SPDIF) ACT=D0 SET=D0

Pin enable, amp unmute, DAC format binding, stream descriptor, BDL, sine fill, DAC stream bind, RUN bit: all writes confirmed by readback. LPIB advances at exactly 192000 bytes/sec across the 290 ms run, wrapping the 8 KB buffer cleanly.
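
The 192000 bytes/sec figure is what the fixed stream format predicts. A sketch of that arithmetic, assuming the HDA stream-format word layout (bit 14 base rate, bits 13:11 multiplier, bits 10:8 divisor, bits 6:4 sample size, bits 3:0 channels minus one) and the fixed format word 0x0011 used by this bring-up; names are illustrative:

```c
#include <stdint.h>

struct hda_fmt {
	unsigned rate_hz;
	unsigned bits;     /* bits per sample; 0 for reserved encodings */
	unsigned channels;
};

/* Decode a 16-bit HDA stream-format word. Reserved encodings are not validated. */
static struct hda_fmt
hda_fmt_decode(uint16_t fmt)
{
	static const unsigned bits_tab[8] = { 8, 16, 20, 24, 32, 0, 0, 0 };
	struct hda_fmt f;
	unsigned base = (fmt & 0x4000) ? 44100 : 48000;
	unsigned mult = ((fmt >> 11) & 0x7) + 1;
	unsigned div = ((fmt >> 8) & 0x7) + 1;

	f.rate_hz = base * mult / div;
	f.bits = bits_tab[(fmt >> 4) & 0x7];
	f.channels = (fmt & 0xf) + 1;
	return (f);
}

/* DMA byte rate: 8-bit samples use 1-byte containers, 16-bit use 2, wider use 4. */
static unsigned
hda_fmt_bytes_per_sec(struct hda_fmt f)
{
	unsigned container = (f.bits <= 8) ? 1 : (f.bits <= 16) ? 2 : 4;

	return (f.rate_hz * f.channels * container);
}
```

Format 0x0011 decodes to 48 kHz, 16-bit, two channels, so the link position should advance by 48000 x 2 x 2 = 192000 bytes each second. Over a 290 ms run that is 55680 bytes, a little under seven passes through the 8 KB buffer.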

audiofs1 (ATI Oland HDMI): Same sequence on DAC nid=2. All registers verified. LPIB advancement varies by run (see earlier observations).

What this commit does not achieve:

Audible signal on the pgsd-bare-metal iMac's internal speaker. Honest assessment: every HDA-spec-defined step in the analog path is performed and verified, but no sound emerges from the speaker. The remaining gap is almost certainly the downstream Class-D amplifier enable, which on Apple hardware is asserted via a codec GPIO that requires vendor-quirk knowledge to address.

This is a substantive limit, not a coding error. The exact bring-up audiofs performs (pin OUT_ENABLE, amp unmute at OFFSET gain, format 0x0011, stream tag bind, RUN) is sufficient to drive a discrete codec on a board whose analog output is not gated by an external amp enable. The iMac's internal speaker happens to be such a gated output.

Documenting this here so a future reader sees the limit clearly. ADR 0006's hypothesis (a clean reimplementation can match the spec-defined path) remains supported. The vendor-quirk surface is separate work, accumulating as needed.

Out of scope, deferred:

  • Apple iMac Class-D amplifier enable via codec GPIO. Would need consultation of Linux's patch_cirrus.c quirks for CS4206 on subsystem 0x106b8200, or Cirrus data sheet for the same.
  • Per-codec quirk infrastructure generally. The clean way to integrate this is a small table keyed on (vendor_id, device_id, subsystem_id) producing callbacks that run after the spec-defined bring-up.
  • HDMI presence detection (HDMI's intermittent LPIB behavior remains a separate question).
  • Continuous streaming, ioctl/sysctl playback control, multiple concurrent streams.
  • Interrupt-driven position tracking.
  • Format negotiation beyond fixed 48k/16/stereo.
  • CLOCK region writing.

This concludes the commit-6.x series. audiofs implements the full HDA-spec-defined controller and codec bring-up for output, end-to-end from PCI attach to DMA-driven position advancement, in around 3100 lines of code with no dependency on snd_hda or its sub-layers.

Status update 2026-05-21 (commit 6f: platform-policy diagnostic, empirical finding documented). The "honest limit" of commit 6e turned out to be more precisely characterized once the HDA spec's GPIO surface (sections 7.3.3.22-27) and the codec's GP I/O Count parameter (section 7.3.4.14) were inspected. Result: the gap between commit 6e and audible output on the iMac speaker sits inside the standard verb surface, not outside it.

What landed in this commit:

Platform-policy diagnostic pass at attach:

  • GP I/O Count query (parameter 0x11) at each populated codec's audio function group; logs NumGPIOs / NumGPOs / NumGPIs and the Wake / Unsol capability flags.
  • Pin EAPD_CAP enumeration (already-stored pin_cap walked, EAPD_CAP bit reported per pin).
  • If a codec advertises any GPIO lines AND no platform codec has been adopted yet, that codec becomes the "platform codec" for runtime GPIO control: SET_GPIO_ENABLE_MASK, SET_GPIO_DIRECTION (all outputs), SET_GPIO_DATA=0 are issued at attach. data=0 is a safe default (active-high amp gates leave amps powered down).

Runtime controls:

  • dev.audiofs.N.gpio_data (read/write int). Writes drive SET_GPIO_DATA on the platform codec's FG nid; reads return the last-written value. ENXIO if no platform codec was adopted.
  • dev.audiofs.N.play_test_tone (read/write int). Any write re-runs audiofs_run_output_stream (DAC bind + RUN + 290 ms LPIB poll + clear) so the empirical sweep for "which GPIO bit enables the speaker amp" can happen without unloading the module.

Eventlog entries added: gpio_cap, gpio_enable_mask_set, gpio_direction_set, gpio_data_init, gpio_data_set, pin_eapd_cap, play_test_tone_req.
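
The GP I/O Count response that the diagnostic pass logs can be decoded as sketched below. The layout is an assumption taken from the HDA spec's parameter 0x11 definition (NumGPIOs in bits 7:0, NumGPOs in bits 15:8, NumGPIs in bits 23:16, GPIUnsol in bit 30, GPIWake in bit 31); the struct and function names are hypothetical.

```c
#include <stdint.h>

struct gpio_count {
	unsigned num_gpios; /* bidirectional lines, bits 7:0 */
	unsigned num_gpos;  /* output-only lines, bits 15:8 */
	unsigned num_gpis;  /* input-only lines, bits 23:16 */
	unsigned unsol;     /* GPI unsolicited-response capable, bit 30 */
	unsigned wake;      /* GPI wake capable, bit 31 */
};

/* Decode the response to GET_PARAMETER(0x11) at the audio function group. */
static struct gpio_count
gpio_count_decode(uint32_t resp)
{
	struct gpio_count c;

	c.num_gpios = resp & 0xff;
	c.num_gpos = (resp >> 8) & 0xff;
	c.num_gpis = (resp >> 16) & 0xff;
	c.unsol = (resp >> 30) & 0x1;
	c.wake = (resp >> 31) & 0x1;
	return (c);
}

/* 8-bit mask covering the first n GPIO lines, e.g. for enable and direction. */
static unsigned
gpio_all_mask(unsigned n)
{
	return ((n >= 8) ? 0xffu : ((1u << n) - 1u));
}
```

A codec reporting four GPIO lines would use mask 0x0f for both the enable mask and the all-outputs direction setting, with the data register left at 0 until policy says otherwise.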

Empirical finding (pgsd-bare-metal, iMac, CS4206 codec, PCI subsystem 0x106b8200):

audiofs0 GPIO inventory: GPIO=4 GPO=0 GPI=0
audiofs0 pins with EAPD_CAP: none
audiofs1 (ATI HDMI) GPIO inventory: GPIO=0

Runtime sweep results (gpio_data value -> tone):

0x00 -> silent
0x01 -> silent       (bit 0 alone)
0x02 -> silent       (bit 1 alone)
0x04 -> silent       (bit 2 alone)
0x08 -> AUDIBLE      (bit 3 alone)
0x09 -> AUDIBLE      (bits 0+3)
0x0a -> AUDIBLE      (bits 1+3)
0x0c -> AUDIBLE      (bits 2+3)
0x0f -> AUDIBLE      (all four bits)
0x00 -> silent       (confirms bit 3 gates, doesn't latch)

Conclusion: GPIO bit 3 on the CS4206 (subsystem 0x106b8200) enables the iMac's internal Class-D speaker amplifier, active-high. Other GPIO bits have no observable effect on amp state. The mechanism is fully spec-defined (SET_GPIO_DATA verb 0x715 at the FG nid); the policy (which bit, which subsystem) is board-specific.

Architectural framing:

  • audiofs core (this commit) implements the standard HDA GPIO control surface. No vendor-specific verbs.
  • Platform policy (commit 6g) will codify the empirical finding as a small data table keyed on PCI subsystem ID producing an initial gpio_data value. On no match, gpio_data stays at 0 (safe).

This shape replaces the "honest limit" framing of commit 6e: the limit was not "outside the HDA spec" but rather "inside the standard verb surface, outside generic autodiscovery." Inspection of capability registers gave us a controllable surface; empirical sweep through standard verbs gave us the policy.

Out of scope, deferred to commit 6g:

  • Automatic gpio_data assertion based on PCI subsys.
  • The policy table itself, with the Apple iMac entry.

Status update 2026-05-21 (commit 6g: platform-policy table, iMac speaker enabled automatically). Adds the small data table that codifies the empirical finding from commit 6f as a (PCI subvendor, PCI subdevice) -> initial gpio_data mapping. With this commit loaded on the pgsd-bare-metal iMac, the audible test signal at attach plays through the internal speaker without operator action; on hardware with no matching entry, gpio_data stays 0 (safe) and the runtime sysctls remain available for empirical investigation.

Entries:

Apple iMac (subvendor 0x106b, subdevice 0x8200, gpio_data=0x08). Comment in source cites the empirical sweep documented in commit 6f.

The table is data with comments, not vendor-quirk code. Adding hardware requires adding one row plus a one-line comment pointing to the commit message that contains the empirical evidence.

The HDA-spec-defined surface (audiofs core) stays unchanged: SET_GPIO_DATA verb at the audio function group node, identical for every codec. Per-board policy lives in the table.
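
A minimal sketch of what such a table and its lookup could look like. The single row is the Apple iMac entry recorded above; the struct, array, and function names are hypothetical and not taken from audiofs.c.

```c
#include <stddef.h>
#include <stdint.h>

/* One row per board: PCI subsystem identity -> initial gpio_data value. */
struct gpio_policy_row {
	uint16_t subvendor;
	uint16_t subdevice;
	uint8_t  gpio_data;
};

static const struct gpio_policy_row gpio_policy[] = {
	/* Apple iMac: GPIO bit 3 gates the speaker amp (commit 6f sweep). */
	{ 0x106b, 0x8200, 0x08 },
};

/* Initial gpio_data for a board; 0 (amps left gated) when no row matches. */
static uint8_t
gpio_policy_lookup(uint16_t subvendor, uint16_t subdevice)
{
	size_t n = sizeof(gpio_policy) / sizeof(gpio_policy[0]);

	for (size_t i = 0; i < n; i++) {
		if (gpio_policy[i].subvendor == subvendor &&
		    gpio_policy[i].subdevice == subdevice)
			return (gpio_policy[i].gpio_data);
	}
	return (0);
}
```

Keeping the fallback at 0 means an unknown board behaves exactly as it did before the table existed: nothing is asserted, and the gpio_data sysctl is still there for a manual sweep.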

This concludes the commit-6.x series. audiofs implements the full HDA-spec-defined output bring-up plus the inspection and policy surfaces needed to make real hardware produce sound. Approximately 3500 lines, no dependency on snd_hda or its sub-layers.

Status update 2026-05-27 evening (audit-as-gate retired by ADR 0010). The gap-and-governance audit specified at audiofs/docs/snd4-gap-governance-audit.md is no longer a gate for AD-3's progression. ADR 0010 (audiofs/docs/adr/0010-retire-audit-as-gate.md, Accepted 2026-05-27 evening) records the framing change: UTF operates by build-and-replace, the audit's purpose was evidentiary not dispositive, and the audit's gate role is misaligned with UTF's actual operating mode. F.3 and onward may proceed under standard ADR-before-code discipline (each sub-stage gets its own ADR) without audit clearance.

What this changes for AD-3's outstanding-work list: the "audit-gate verification still owed" item is removed. Three substantive items remain: F.5 semasound (the userland semantic audio system per ADR 0004/0007), F.6 semaaud retirement (modelled on AD-2), and the maintenance model (the policy decision Vic owns; an explicitly-owed input per ADR 0008).

What this does NOT change: ADR 0006's decision to replace snd(4) in full stands (its rationale is principled, not measurement-contingent, per ADR 0006 lines 50-54). ADR 0008's overall structure stands (F.0-F.7 sub-stage breakdown, dependency ordering, owed inputs). The governance-independence principle in docs/UTF_ARCHITECTURAL_DISCIPLINE.md stands. ADR-before-code discipline holds. What is retired is one specific procedural step (the audit gate), not the broader discipline that produced ADRs 0001-0009.

The trade made by this retirement is recorded in ADR 0010 "Trade made by this decision": pre-implementation evidence-gathering is replaced by during-and-after implementation evidence (the substrate either works on real hardware or doesn't). The audible-output milestone of 2026-05-21 is itself such evidence; F.3+ work will produce more. The audit spec at audiofs/docs/snd4-gap-governance-audit.md is preserved as background reference material with a Status section addendum recording the role change.

Status update 2026-05-28 (F-stage reconciliation by ADR 0011). ADR 0011 (audiofs/docs/adr/0011-fstage-reconciliation.md, Accepted 2026-05-28) reconciles the F-stage map in ADR 0008 with the vertical-slice path actually taken in commits 1-6g. The reconciliation observes that the audible-output milestone proved the controller-to-DAC path is mechanically achievable end-to-end (real evidence value), but did not close F.1 (state-file publication), F.2 (events ring), or F.3 (full data path with user control) in the form ADR 0008 / the proposal required.

What this changes for AD-3's outstanding-work list: the three-item list (F.5, F.6, maintenance model) from ADR 0010 is replaced with the more honest seven-item list (F.1, F.2, F.3 sub-milestones a-f, F.4, F.5, F.6, maintenance model). The substantive requirements are not softened; the closure criteria are reframed to reflect what audiofs.c actually contains versus what each sub-stage owes. The status string above is updated accordingly.

F.3 is decomposed by ADR 0011 into six named sub-milestones (continuous streaming, user-controlled playback, interrupt-driven position tracking, underrun detection, format negotiation, HDMI bring-up) each of which will receive its own ADR before implementation. ADR 0008 had anticipated the F.3 decomposition needing its own ADR; ADR 0011 names the sub-milestones; per-sub-milestone ADRs supply the scoping for each.

What this does NOT change: ADR 0006's decision stands, ADR 0008's overall structure stands, ADR 0010's retirement of the audit-as-gate stands, ADR-before-code discipline holds (F.1, F.2, F.3.a-f, F.4 each need their own ADR before implementation). What is added is structure for the remaining work: a closure-dependency map (F.1 -> F.2 -> F.3.a -> F.3.b; F.3.c independent of F.3.a, feeds F.3.d and F.4; F.3.e and F.3.f parallel; F.4 -> F.5 -> F.6) and reframed closure criteria for F.1, F.2, F.3 that match their substantive requirements.

The next ADR after this one will scope F.1 (the state-file publication) per its reframed closure criteria.

Status update 2026-05-28 (F.1 scoped by ADR 0012). ADR 0012 (audiofs/docs/adr/0012-f1-state-file.md, Proposed 2026-05-28) scopes F.1: the state-file machinery at /var/run/sema/audio/state. The companion byte-level spec lives in shared/AUDIO_STATE.md (analogous to how shared/INPUT_STATE.md accompanies inputfs ADR 0002 and shared/CLOCK.md accompanies ADR 0003).

The state file is physics-only per ADR 0007: it publishes hardware capability (controller inventory, endpoint inventory with format-capability bitmasks) and runtime state (which endpoints are stream-active, current format if active), not policy. It uses the established UTF idiom (magic 0x54535541 "AUST", little-endian, seqlock-protected multi-field reads, 4-byte ASCII magic mnemonic encoding mirroring CLOCK and INST). Total file size in v1 is 2,624 bytes (64-byte header + 8 controller slots × 64 bytes + 32 endpoint slots × 64 bytes).
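
The size and magic figures can be cross-checked with a few constants. This sketch assumes the magic occupies the first four bytes of the region (the authoritative layout is shared/AUDIO_STATE.md); the constant and function names are illustrative, not the ones in audiofs_state.h.

```c
#include <stdint.h>

#define AUDIO_STATE_MAGIC 0x54535541u /* "AUST" when stored little-endian */

enum {
	AUDIO_STATE_HEADER_SIZE = 64,
	AUDIO_STATE_CTRL_SLOTS = 8,
	AUDIO_STATE_CTRL_SLOT_SIZE = 64,
	AUDIO_STATE_EP_SLOTS = 32,
	AUDIO_STATE_EP_SLOT_SIZE = 64,
	AUDIO_STATE_TOTAL_SIZE = AUDIO_STATE_HEADER_SIZE +
	    AUDIO_STATE_CTRL_SLOTS * AUDIO_STATE_CTRL_SLOT_SIZE +
	    AUDIO_STATE_EP_SLOTS * AUDIO_STATE_EP_SLOT_SIZE
};

/* 64 + 8 * 64 + 32 * 64 = 2624; fail the build if the layout drifts. */
_Static_assert(AUDIO_STATE_TOTAL_SIZE == 2624, "v1 state region is 2624 bytes");

/* Read a little-endian u32 from a byte buffer, independent of host order. */
static uint32_t
le32_read(const uint8_t *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* 1 if the region starts with the AUST magic (assumed to sit at offset 0). */
static int
audio_state_magic_ok(const uint8_t *region)
{
	return (le32_read(region) == AUDIO_STATE_MAGIC);
}
```

Because the value is little-endian, a hexdump of a valid file begins with the bytes 41 55 53 54, which read as "AUST" in the ASCII column.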

The F.1 implementation lands as a separate commit (or small series). ADR 0012 and shared/AUDIO_STATE.md are specification; the kernel-side publish/unpublish code is the F.1 implementation that follows. Expected scope: 200-400 lines of C in the kernel publish path, plus a small Zig API layer in shared/src/audio.zig (analogous to shared/src/input.zig), plus optional diagnostic reader.

ADR 0012 is Accepted (ratified 2026-05-28). F.1 implementation (kernel publish path, Zig API, optional diagnostic reader) follows.

Status update 2026-05-28 (F.1 implementation landed, [~] awaiting bench verification). The F.1 state-file publication is implemented across three commits: ADR 0012 ratified to Accepted; audiofs_state.h plus the kernel publish path in audiofs.c (module-global controller registry, endpoint enumeration from the topology walk, VFS publication mirroring inputfs, modeventhand for clean load/unload); and shared/src/audio.zig (the userspace reader, unit-tested 4/4 under Zig 0.15.1).

Verification state is split honestly:

  • Kernel publish path: [~]. The audiofs_state.h byte-level layout is compile-verified (all _Static_assert size and offset checks pass under gcc), but the code has not been built in a full PGSD kernel or loaded on hardware.
  • Zig reader: [x]. Unit-tested under Zig 0.15.1 (constants match the spec; a hand-built region round-trips; state_valid=0 and seqlock-contention edge cases behave correctly).
  • End-to-end: [~]. Owed: build the PGSD kernel with audiofs, kldload on pgsd-bare-metal, confirm /var/run/sema/audio/state exists with magic 0x54535541 and version 1, parse it (hexdump or a tool over the Zig reader), confirm the controller and endpoint inventory including the iMac internal speaker endpoint, then kldunload and confirm clean removal/invalidation. That bench pass closes F.1 per ADR 0012's criteria and flips this entry's F.1 line from [~] to [x].

Once F.1 is bench-verified, F.2 (events ring) is unblocked per the ADR 0011 closure-dependency map (F.1 -> F.2).

Status update 2026-05-28 (F.1 bench-verified [x] on pgsd-bare-metal). The kernel publish path is verified on real hardware. After clearing three deployment/build issues (none in the F.1 logic itself: a 14-commits-behind checkout on the bench machine, a missing vnode_if.h in the module SRCS, and an unused-variable -Werror stop), the module builds clean, nm confirms the audiofs_state_* symbols are in the .ko, and loading publishes the state file.

Verified results:

  • /var/run/sema/audio/state exists, root:wheel 0644, exactly 2624 bytes.
  • Header parses: magic 0x54535541 "AUST", version 1, state_valid 1, controller_count 2, endpoint_count 11, inventory_seq 2, slot counts 8/32, slot sizes 64/64.
  • Controller 0: Intel 0x8086:0xa170, subsystem 0x8086:0x7270, ISS=7 OSS=9, 64-bit. Controller 1: ATI 0x1002:0xaab0 (Oland HDMI).
  • 11 endpoints enumerated. The iMac internal speaker is present (endpoint_id 8, kind=speaker, direction=output, electrically_ready=1, runtime_active=1, current_format 0x0011), satisfying the F.1 closure criterion. Other endpoints: Line_Out, HP_Out, SPDIF_Out, a second Line_Out on the CS4206, and six Digital_Other_Out HDMI endpoints on the Oland.
  • MOD_UNLOAD fires the invalidating write (state_valid set to 0; file mtime advances), and reload cleanly republishes a valid region. This satisfies ADR 0012 closure criterion 4 ("removed, OR marked invalid by zeroing state_valid"); the code takes the invalidate path.
  • No lock or sleep warning in dmesg: VFS publication inline in device_attach works under kldload process context. The kthread-deferral pattern inputfs uses was not needed.
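
The reported file size can be cross-checked against the slot counts and slot sizes in the header dump above. A minimal sketch of that arithmetic follows; the 64-byte header size is an inference from the remainder, not a value quoted from shared/AUDIO_STATE.md, and the function names are illustrative.

```python
# Values taken from the bench header dump: slot counts 8/32, slot sizes 64/64.
CONTROLLER_SLOTS = 8
ENDPOINT_SLOTS = 32
CONTROLLER_SLOT_SIZE = 64
ENDPOINT_SLOT_SIZE = 64
FILE_SIZE = 2624  # observed size of /var/run/sema/audio/state


def slot_bytes() -> int:
    """Bytes occupied by the fixed controller and endpoint slot arrays."""
    return (CONTROLLER_SLOTS * CONTROLLER_SLOT_SIZE
            + ENDPOINT_SLOTS * ENDPOINT_SLOT_SIZE)


def implied_header_bytes() -> int:
    """Bytes left over for the header (inferred, not read from the spec)."""
    return FILE_SIZE - slot_bytes()


print(slot_bytes(), implied_header_bytes())
```

Under these assumptions the slot arrays account for 2560 bytes, leaving 64 bytes for the header.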

F.1 is closed [x]. F.2 (events ring) is now the next unblocked sub-stage per the ADR 0011 dependency map.

Two non-blocking follow-ups the bench surfaced, filed so they are not lost:

  • F.1-fu1 (unload removes vs invalidates): RESOLVED 2026-05-28. The MOD_UNLOAD handler writes an invalid region (state_valid=0) and vn_close()s the file but does not unlink it, so the file persists on disk (invalidated) when audiofs is not loaded. Investigation found inputfs does the same (invalidate-and-close, no unlink), so this is the established UTF substrate pattern, not an audiofs defect. ADR 0012 closure criterion 4 permits invalidation. Disposition: keep invalidate-and-close to stay consistent with inputfs; the only actual fix was correcting the inline comment, which had claimed "remove" while the code invalidates. Comment corrected. No behavior change.
  • F.1-fu2 (SPDIF classified as HDMI): RESOLVED 2026-05-28. SPDIF_Out endpoints were published with kind=6 (HDMI) because audiofs_state_fill_output_endpoint treated all digital output pins as HDMI. Fixed: added AUDIOFS_EP_KIND_SPDIF (value 8, first of the previously reserved range) and branched the digital-output classifier so pin-config device kind 0x4 (SPDIF_Out) maps to SPDIF while 0x5 (Digital_Other_Out) stays HDMI. Updated in lockstep across the four schema surfaces: audiofs_state.h (kernel enum), audiofs.c (classifier), shared/AUDIO_STATE.md (kind table), and shared/src/audio.zig (KIND_SPDIF + kindName). Purely additive; no region-layout change, header static-asserts and the 4 Zig reader tests still pass. The reserved range is now 9..15. DisplayPort-vs-HDMI discrimination within the 0x5 family remains deferred to F.3.f as before.
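
The F.1-fu2 classifier branch can be sketched as below. This is a host-side Python illustration of the mapping described above, not the kernel code in audiofs.c; the constant and function names are hypothetical, and only the numeric values (kinds 6 and 8, pin-config device kinds 0x4 and 0x5) come from the entry.

```python
# Endpoint kind values named in the entry: HDMI is 6, the added SPDIF kind is 8.
KIND_HDMI = 6
KIND_SPDIF = 8

# HDA pin-config device kinds named in the entry.
PIN_DEV_SPDIF_OUT = 0x4
PIN_DEV_DIGITAL_OTHER_OUT = 0x5


def classify_digital_output(pin_device: int) -> int:
    """Map a digital output pin's device kind to an endpoint kind."""
    if pin_device == PIN_DEV_SPDIF_OUT:
        return KIND_SPDIF
    if pin_device == PIN_DEV_DIGITAL_OTHER_OUT:
        # DisplayPort-vs-HDMI discrimination is deferred to F.3.f.
        return KIND_HDMI
    raise ValueError(f"not a digital output device kind: {pin_device:#x}")


print(classify_digital_output(0x4), classify_digital_output(0x5))
```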

Deployment lesson recorded for future bench work: changes flow pgsd-dev -> push -> pgsd-bare-metal pull -> install -> build -> load, and a verification gate runs at each hop (grep the source the next stage will consume; nm the .ko before loading). Three separate "fix did not reach the build" incidents this session were all caught by those gates rather than by a misleading load result. The gates belong in the audiofs bench-test doc.

Status update 2026-05-28 (F.2 scoped by ADR 0013). ADR 0013 (audiofs/docs/adr/0013-f2-events-ring.md, Proposed 2026-05-28) scopes F.2: the events ring at /var/run/sema/audio/events. The companion byte-level spec is shared/AUDIO_EVENTS.md.

The ring mirrors shared/INPUT_EVENTS.md closely (the decision owner chose maximal consistency with the established substrate): 64-byte header, 64-byte event slots, power-of-two slot count (256 in v1, total 16,448 bytes), lock-free single-producer/multi-consumer with the seq-published-last writer protocol and seq-revalidation reader protocol, plus a pollable notification fd mirroring inputfs ADR 0021's /dev/inputfs_notify. Magic 0x41554556 "AUEV", version 1.
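
A minimal single-threaded model of the ring described above: the size arithmetic, the seq-published-last write, and the seq-revalidation read. The slot-index mapping `(seq - 1) & mask` is an assumption (chosen so that seq 1 lands in the first slot); the class and method names are illustrative and do not come from audiofs_events.h or shared/src/audio.zig.

```python
HEADER_BYTES = 64
SLOT_BYTES = 64
SLOT_COUNT = 256  # power of two in v1


def ring_bytes(slot_count: int = SLOT_COUNT) -> int:
    """Total region size: one header plus the slot array."""
    return HEADER_BYTES + slot_count * SLOT_BYTES


class EventRing:
    def __init__(self, slot_count: int = SLOT_COUNT):
        assert slot_count & (slot_count - 1) == 0, "slot count must be a power of two"
        self.mask = slot_count - 1
        self.slots = [{"seq": 0, "payload": None} for _ in range(slot_count)]
        self.writer_seq = 0

    def publish(self, payload) -> int:
        seq = self.writer_seq + 1
        slot = self.slots[(seq - 1) & self.mask]  # assumed index mapping
        slot["payload"] = payload   # body first
        slot["seq"] = seq           # seq published last
        self.writer_seq = seq
        return seq

    def read(self, seq: int):
        slot = self.slots[(seq - 1) & self.mask]
        if slot["seq"] != seq:      # not yet written, or already overwritten
            return None
        payload = slot["payload"]
        if slot["seq"] != seq:      # revalidate after the copy
            return None
        return payload


print(ring_bytes())
```

In this model a reader that falls more than one lap behind gets `None` for the overwritten sequence number, which is the overrun case the Zig reader tests exercise.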

Event taxonomy uses two-level (source_role, event_type) dispatch like inputfs: role 1 = stream (stream_begin, stream_end, xrun, format_change), role 2 = endpoint-lifecycle (endpoint_attach, endpoint_detach, inventory_full). Physics-only per ADR 0007: the xrun payload reserves gap_sample_pos and gap_frames (the physics fact of where/how-big the gap was), and each event carries ts_sync (audio sample position). Both fields are reserved now so F.3.d (xrun detection) and F.4 (clock writer) populate existing fields rather than breaking the wire format.
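
The two-level dispatch can be pictured as a lookup keyed on the (source_role, event_type) pair. In the sketch below only three type numbers are confirmed elsewhere in this entry (endpoint_attach = 1, xrun = 3, format_change = 4); the remaining numbers are assumed from the listing order above and should be checked against shared/AUDIO_EVENTS.md. The table and function names are hypothetical.

```python
ROLE_STREAM = 1
ROLE_ENDPOINT = 2

EVENT_NAMES = {
    (ROLE_STREAM, 1): "stream_begin",       # assumed from listing order
    (ROLE_STREAM, 2): "stream_end",         # assumed from listing order
    (ROLE_STREAM, 3): "xrun",               # stated in this entry
    (ROLE_STREAM, 4): "format_change",      # stated in this entry
    (ROLE_ENDPOINT, 1): "endpoint_attach",  # stated in this entry
    (ROLE_ENDPOINT, 2): "endpoint_detach",  # assumed from listing order
    (ROLE_ENDPOINT, 3): "inventory_full",   # assumed from listing order
}


def event_name(source_role: int, event_type: int) -> str:
    """Resolve a (role, type) pair; unknown pairs are reported, not rejected."""
    return EVENT_NAMES.get((source_role, event_type), "unknown")


print(event_name(2, 1))
```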

F.2 emits endpoint-lifecycle events immediately (endpoints exist as soon as controllers attach). Stream events have their schema fully specified but are emitted only once the data path creates real streams (stream_begin/end at F.3.a, format_change at F.3.e, xrun at F.3.d). The events publisher writes the state region's last_event_seq, closing the correlation loop F.1 left open.

ADR 0013 is Accepted (ratified 2026-05-28). F.2 implementation (kernel publish path, audiofs_events.h, notify cdev, Zig EventRingReader) follows.

Status update 2026-05-28 (F.2 implementation landed, [~] awaiting bench verification). F.2 is implemented across two commits: the kernel publish path (audiofs_events.h plus the events code in audiofs.c: in-kernel ring buffer, seq-last publish protocol, endpoint_attach emission wired into the controller register, the /dev/audiofs_notify pollable cdev mirroring inputfs AD-41.3, and the last_event_seq correlation into the state region) and the Zig EventRingReader in shared/src/audio.zig.

Verification state, split honestly as for F.1:

  • Kernel publish path: [~]. The audiofs_events.h byte-level layout is compile-verified (all _Static_assert checks pass under gcc), and the publish code is audited for the -Werror traps this session has hit (unused symbols, missing includes sys/poll.h / sys/event.h, the atomic u64 casts, uid/gid constants), but it has not been built in a full PGSD kernel or loaded on hardware.
  • Zig reader: [x]. 8/8 audio.zig tests pass under Zig 0.15.1 (4 F.1 state + 4 new F.2 events: constants, endpoint_attach drain, xrun payload decode, overrun detection).
  • End-to-end: [~]. Owed: build the PGSD kernel, run the nm gate (audiofs_events_* and audiofs_notify_* symbols in the .ko), kldload on pgsd-bare-metal, confirm /var/run/sema/audio/events exists (16448 bytes, magic 0x41554556, version 1, ring_valid 1), endpoint_attach events flow (writer_seq == endpoint count, 11 on the bench iMac), the state region last_event_seq matches the ring writer_seq, and /dev/audiofs_notify wakes a reader. Then kldunload clean, reload clean. That bench pass closes F.2 per ADR 0013 and flips the F.2 line to [x].

The deployment-gate discipline from the F.1 bench session applies: pull/apply on the build machine, grep the deployed source, nm the .ko for the expected symbols, then load. F.2 adds more symbols to check (the events and notify functions).

Once F.2 is bench-verified, F.3.a (continuous streaming) is the next sub-stage per the ADR 0011 dependency map (F.1 -> F.2 -> F.3.a -> F.3.b).

Status update 2026-05-28 (F.2 bench-verified [x] on pgsd-bare-metal). The events-ring publish path is verified on real hardware. After one build fix (knlist_init takes 5 args on this FreeBSD, not 6; the in-tree inputfs was the authoritative reference), the module builds clean, nm confirms the audiofs_events_* and audiofs_notify_* symbols in the .ko, and loading publishes the ring.

Verified results:

  • /var/run/sema/audio/events exists, root:wheel 0644, exactly 16448 bytes.
  • Header parses: magic 0x41554556 "AUEV", version 1, ring_valid 1, event_size 64, slot_count 256, writer_seq 11, earliest_seq 1.
  • writer_seq 11 equals the endpoint count: one endpoint_attach event per enumerated endpoint (the same 11 endpoints F.1 publishes). First slot decodes to seq 1, source_role 2 (endpoint), event_type 1 (attach), a real nanouptime ts_ordering, ts_sync 0 (correct; F.4 not yet), payload endpoint_id 6 kind 3 (Line_Out) direction 1 (output), matching the F.1 inventory's slot 0.
  • The correlation loop is closed: the state region's last_event_seq now reads 11 (was 0 under F.1 alone), matching the events ring writer_seq. The events publisher correctly writes back into the state region.
  • No lock or sleep warning in dmesg: the selwakeup / KNOTE calls from the publish path and the VFS-in-attach-context both work under kldload. The kthread-deferral pattern was not needed (same result as F.1).
  • /dev/audiofs_notify cdev created (symbols present); the publish path calls selwakeup + KNOTE_UNLOCKED on each event. A poll/kqueue wake test would exercise criterion 6 end to end; the cdev and wake calls are in place.

F.2 is closed [x]. F.3.a (continuous streaming) is now the next unblocked sub-stage per the ADR 0011 dependency map.

One build fix recorded (folded into the consolidated patch): knlist_init arg count 6 -> 5. This was the only signature mismatch; knlist_add/remove/destroy, seldrain, selwakeup, selrecord, KNOTE_UNLOCKED, and make_dev_p all matched in-tree inputfs on the first try. The build gate caught the knlist_init mismatch before load, consistent with the deployment-gate discipline from the F.1 session.

Status update 2026-05-29 (F.3.a scoped by ADR 0014). ADR 0014 (audiofs/docs/adr/0014-f3a-continuous-streaming.md, Proposed 2026-05-29) scopes F.3.a: continuous streaming via a kthread-driven buffer refill loop, in-kernel audiofs_stream_begin / audiofs_stream_end lifecycle entry points, F.2 stream_begin / stream_end event emission, and conversion of the attach-time test tone to use the new continuous-stream API.

Key design decisions (decision-owner choices):

  • Refill cadence: per-stream kthread polling LPIB at 10 ms intervals. ADR 0011 places interrupt-driven position tracking in F.3.c; F.3.a stays cleanly within its scope by polling. The kthread is intentionally the placeholder F.3.c will replace with an interrupt handler.
  • Data source: continuous sine wave (the existing commit-6 waveform, looped). Hearable proof at bench; F.3.b will replace with a real source.
  • Attach behavior: the existing one-shot test tone is converted to a stream_begin call. After kldload, the speaker plays continuously until kldunload. That is the F.3.a closure proof. Bench iteration of F.3.a uses kldunload as the off switch. The operational consequence is documented; this is deliberate, not a surprise.
  • One-shot helpers removed: audiofs_run_output_stream and the LPIB sampling loop, along with the related one-shot diagnostic log events, are retired. The information they provided (does LPIB advance) is now ambient in the refill kthread's continuous polling. This is the build-and-replace framing from ADR 0010 applied: when an idea is superseded, it goes, not parked.

What F.3.a populates: the stream_begin and stream_end event payloads reserved by ADR 0013 / shared/AUDIO_EVENTS.md. stream_begin carries stream_id, format (0x0011), channels (2), rate_hz (48000). stream_end carries stream_id and frames_total derived from cumulative LPIB delta with wrap accounting. xrun (type 3) and format_change (type 4) stay schema-reserved until F.3.d and F.3.e. ts_sync stays 0 until F.4. No wire-format change.
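
The "cumulative LPIB delta with wrap accounting" behind frames_total can be sketched as follows. This assumes LPIB is a byte offset into a cyclic buffer and that samples are taken at least once per buffer period; the buffer size in the usage line is a made-up value for illustration, and the function names are hypothetical.

```python
BYTES_PER_FRAME = 4  # 16-bit samples, 2 channels (48k/16/stereo)


def lpib_delta(prev: int, cur: int, buffer_bytes: int) -> int:
    """Bytes consumed between two LPIB samples, tolerating one wrap."""
    return (cur - prev) % buffer_bytes


def frames_total(lpib_samples, buffer_bytes: int) -> int:
    """Accumulate deltas across successive LPIB samples and convert to frames."""
    total_bytes = 0
    prev = lpib_samples[0]
    for cur in lpib_samples[1:]:
        total_bytes += lpib_delta(prev, cur, buffer_bytes)
        prev = cur
    return total_bytes // BYTES_PER_FRAME


# The third sample has wrapped past the end of an 8192-byte buffer.
print(frames_total([0, 4096, 100, 4196], buffer_bytes=8192))
```

If more than one full buffer elapses between samples, the modulo silently undercounts, which is why the sampling cadence has to stay well inside the buffer period.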

What F.3.a does NOT do (per ADR 0011's sub-milestone boundaries):

  • F.3.b: user-facing API. F.3.a's entry points are in-kernel callable; F.3.b wraps them.
  • F.3.c: interrupts. F.3.a polls LPIB from a kthread.
  • F.3.d: xrun detection. F.3.a's kthread keeps refills ahead of consumption under normal conditions; observed xruns are diagnostic only until F.3.d wires them to the events ring.
  • F.3.e: format negotiation. F.3.a hardcodes 48k/16/stereo.
  • F.3.f: HDMI bring-up.
  • F.4: clock writing. ts_sync stays 0.

ADR 0014 is Accepted (ratified 2026-05-29). F.3.a implementation (audiofs.c changes for the kthread, entry points, attach rewrite, removal of one-shot helpers, plus small Zig event-helper additions) follows.

Status update 2026-05-29 (F.3.a implementation landed, [~] awaiting bench). F.3.a is implemented across two code commits on top of the intermediate removals:

  • Kernel publish path (audiofs.c): audiofs_stream_begin / audiofs_stream_end as the in-kernel lifecycle entry points (F.3.b will wrap them in a user surface, F.3.c will replace the kthread refill with an interrupt path); audiofs_refill_worker kthread polling SDnLPIB at 10 ms; audiofs_refill_sine_fragment helper. Lock ordering: hw_lock for register writes / CORB commands, state_sx for F.2 event emission, no recursive locks. The stream_begin call moved out of audiofs_walk_topology (which runs under hw_lock) and into audiofs_attach (after the lock is released and after audiofs_state_register so the endpoint inventory is published), so stream_begin can take hw_lock cleanly without recursion.

  • Zig EventRingReader (shared/src/audio.zig): Event.streamBegin and Event.streamEnd payload decoders. 10/10 audio.zig tests pass under Zig 0.15.1 (4 F.1 + 4 F.2 endpoint/xrun/overrun + 2 new F.3.a stream events).

Verification state:

  • Zig reader: [x]. 10/10 tests pass.
  • Kernel path: [~]. Compile-audited (braces balanced, em-dash check clean, all symbols defined and referenced, lock ordering reviewed: walk_topology no longer calls stream_begin so no recursion; stream_end called from detach before hw_lock is taken for reset). Not yet built in a full PGSD kernel.
  • End-to-end: [~]. Owed: build the PGSD kernel, run the nm gate (audiofs_stream_begin, audiofs_stream_end, audiofs_refill_worker, audiofs_refill_sine_fragment symbols in the .ko), kldload on pgsd-bare-metal, confirm the iMac internal speaker plays a continuous 750 Hz sine wave, the F.2 events ring shows a stream_begin event at attach (writer_seq advances to 12 from 11), and on kldunload the speaker stops cleanly with a stream_end event whose frames_total is consistent with the elapsed runtime.

Operational reminder per ADR 0014: bench iteration of F.3.a uses kldunload as the off switch. The iMac sings until then.

The deployment-gate discipline from earlier sessions applies: pull on the build machine, grep the deployed source for new symbols, nm the .ko for them before loading. The predictable -Werror traps were audited (the audit caught a recursive-lock bug in the first draft and a missing hw_lock around a CORB-using send_command; both were fixed before patch generation).

Status update 2026-05-29 (F.3.a bench-verified [x] on pgsd-bare-metal). F.3.a is closed on real hardware with the amended closure criteria from ADR 0014 (post-bench safety amendment, same date).

Bench history, four iterations:

  1. First load: stream_begin succeeded, stream_end hung the machine on kldunload (msleep_spin used on an MTX_DEF mutex). Operator had to restart. Fixed: msleep_spin -> msleep.
  2. Second load: clean kldunload, frames_total 272157 consistent with ~5.7 seconds elapsed, all events correlated, but speaker was silent. Diagnosis: the platform-policy table lookup keyed on controller PCI subsystem (Intel) instead of codec FG subsystem (Apple); never matched the iMac entry; the speaker amp gate stayed off. Fixed: lookup uses codec->fg_subsystem.
  3. Third load: sound came out. Loud. Operator could not silence through SSH; had to pull power. Fixed: sine amplitude dropped 100x (-6 dBFS -> -40 dBFS) AND autoplay made opt-in via hw.audiofs.test_tone tunable (default 0). ADR 0014 amended to reflect the new operational consequence.
  4. Fourth load (this one): clean default-silent load (stream_begin_skipped_tone_off events in the log, writer_seq=11 with no stream_begin), runtime tunable toggle 0->1 produced quiet sine, toggle 1->0 stopped it cleanly. kldunload clean after toggling. The in-band off switch works. F.3.a closed.

Verified bench results:

  • default kldload: silent, stream_begin_skipped_tone_off events emitted (one per controller). writer_seq=11.
  • sysctl hw.audiofs.test_tone=1: iMac internal speaker plays continuous 750 Hz sine at room-comfortable volume via the F.3.a kthread refill loop. writer_seq advances (stream_begin event).
  • sysctl hw.audiofs.test_tone=0: stream_end fires cleanly via the in-band off switch. Speaker stops with no click. writer_seq advances (stream_end event).
  • build.sh unload after the toggle cycle: clean, no hang, no kthread leak.
  • State <-> events correlation invariant preserved throughout (last_event_seq tracks writer_seq).

Three real bugs surfaced and fixed during F.3.a bench:

  • msleep primitive mismatch (msleep_spin requires MTX_SPIN; hw_lock is MTX_DEF). Fixed in commit d760589.
  • Platform-policy lookup key (controller PCI subsystem vs codec FG subsystem; the latter identifies the board on Macs). Fixed in commit b2d3439. This was a pre-existing commit-6g bug, invisible until F.3.a required audibility.
  • Bench-safety operational miscalibration (-6 dBFS at gain 115 was unbearable continuously; no in-band off switch). Fixed in commit 1365098 with ADR 0014 amendment.

Discipline lesson recorded in ADR 0014's amendment section: the design contract is on paper, but bench reality reserves the right to amend when the original framing was wrong about real operational impact.

F.3.a is closed [x]. F.3.b (user-facing control API) is the next sub-stage per the ADR 0011 dependency map (F.3.a -> F.3.b). The audiofs_stream_begin / audiofs_stream_end signatures and the F.2 stream event payloads are now stable; F.3.b will wrap them.

Status update 2026-05-30 (F.3.b implementation landed, [~] awaiting bench). F.3.b ratified to Accepted (ADR 0015) and implemented in two coordinated commits:

  • Kernel publish path (audiofs.c): new cdevsw with open / close / write / read / poll / ioctl handlers; 32 KB user ring per controller (malloc'd at attach, head/tail size_t cursors with power-of-2 mask); new user_ring_mtx (MTX_DEF, also the back-pressure msleep address); 3-state source machine (stopped / running-sine / running-user); audiofs_source_set helper; audiofs_refill_user_fragment with shortfall zero-fill and underflow counting; audiofs_refill_fragment dispatcher consulting source under user_ring_mtx. cdev created in attach via make_dev_s, destroyed FIRST in detach via destroy_dev so in-flight ops drain before stream teardown. The test_tone sysctl handler updated to respect cdev_open (cdev consumer wins; tunable still recorded so cdev_close consults it).

  • Userland bench tool (audiofs/tools/playtone/): a small C program that writes a bounded N seconds of quiet sine to /dev/audiofs and exits. This is the bench-safety gate from ADR 0015: bounded process lifetime bounds audible time, preventing a repeat of F.3.a's pulled-power scenario.
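
The user-ring mechanics named in the kernel bullet above (power-of-2 mask, head/tail cursors, shortfall zero-fill, underflow counting) can be modelled in a few lines. This is a host-side sketch, not the audiofs.c implementation: the class and method names are illustrative, and where the kernel writer sleeps on a full ring, this model simply returns a short count.

```python
class UserRing:
    def __init__(self, size: int = 32 * 1024):
        assert size & (size - 1) == 0, "ring size must be a power of two"
        self.buf = bytearray(size)
        self.mask = size - 1
        self.head = 0        # total bytes ever written by the producer
        self.tail = 0        # total bytes ever consumed by the refill side
        self.underflows = 0  # refills that had to be padded with silence

    def used(self) -> int:
        return self.head - self.tail

    def free(self) -> int:
        return len(self.buf) - self.used()

    def write(self, data: bytes) -> int:
        """Accept as much as fits; a short count stands in for back-pressure."""
        n = min(len(data), self.free())
        for i in range(n):
            self.buf[(self.head + i) & self.mask] = data[i]
        self.head += n
        return n

    def refill(self, fragment_bytes: int) -> bytes:
        """Produce one fragment, zero-filling any shortfall."""
        n = min(fragment_bytes, self.used())
        out = bytearray(fragment_bytes)  # starts as silence (all zeros)
        for i in range(n):
            out[i] = self.buf[(self.tail + i) & self.mask]
        self.tail += n
        if n < fragment_bytes:
            self.underflows += 1
        return bytes(out)


ring = UserRing(size=8)
ring.write(b"abcdef")
print(ring.refill(4), ring.free())
```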

Verification state:

  • Kernel path: [~]. Compile-audited (braces 282/282, em-dashes 0, all 9 new symbols defined and referenced, lock-ordering reviewed: state_sx -> user_ring_mtx -> hw_lock; no recursive locks; destroy_dev sequenced first in detach so in-flight cdev ops drain before stream teardown). Not yet built in a full PGSD kernel.

  • playtone: not yet built or run.

  • End-to-end: [~]. Owed: build the PGSD kernel; nm the .ko for new symbols (audiofs_cdev_open / _close / _write / _read / _ioctl / _poll, audiofs_source_set, audiofs_refill_fragment, audiofs_refill_user_fragment); kldload on pgsd-bare-metal; ls -l /dev/audiofs0 (mode 0666 root:wheel); build playtone; run ./playtone /dev/audiofs0 1 and hear ~1 second of quiet sine; verify F.2 stream_begin event on open and stream_end event on close with frames_total ~48000 (1 second at 48 kHz); verify EBUSY on double-open; verify SIGKILL on a running writer cleans up state; test source swap (test_tone=1; cdev_open should swap sine to user data without a stream restart, single stream_begin in the F.2 ring); test back-pressure (slow writer triggers msleep without crashing the kthread); kldunload clean after all the above.

Known v1 behavior (acceptable per ADR 0015 closure criteria):

  • Cold-open path has ~85 ms of pre-existing sine leak from the BDL initial fill before the kthread's first refill iteration switches to user data. Documented in cdev_open block comment; F.3.c may address.

  • close() in v1 does not drain queued audio (up to ~210 ms lost on close). Documented in ADR 0015.

  • Underflow counter accumulates but emits no F.2 event; F.3.d will surface as xrun events.

The deployment-gate discipline applies: pull, install, build, nm gate for the new symbols, then load. Bench iteration uses playtone (bounded process lifetime) rather than indefinite manual writers. The F.3.a discipline lesson holds: design contract on paper, bench reality reserves the right to amend if the operational impact differs from the predicted one.

Status update 2026-05-30 (F.3.b bench-verified [x] on pgsd-bare-metal). F.3.b is closed on real hardware. All seven ADR 0015 closure criteria are met across two bench sessions:

Bench session 1 (basic audibility, ADR 0015 criteria 1-3):

  • /dev/audiofs0 cdev exists, mode 0666 root:wheel.
  • playtone /dev/audiofs0 1 wrote 192000 / 192000 bytes; iMac internal speaker played 1 second of quiet 750 Hz sine via the kthread refill loop drawing from the user ring.
  • F.2 stream_begin event on cdev_open, stream_end on cdev_close. frames_total=40815 (~850 ms of consumed fragments before close-doesn't-drain truncated; v1 behavior documented in ADR 0015).
  • State <-> events correlation invariant preserved (writer_seq=13 at end of session matched state file's last_event_seq).
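
The session 1 numbers follow from the fixed 48k/16/stereo format. A quick arithmetic check, with illustrative helper names:

```python
RATE_HZ = 48000
CHANNELS = 2
BYTES_PER_SAMPLE = 2  # 16-bit


def bytes_for_seconds(seconds: int) -> int:
    """Bytes a writer must supply for the given duration."""
    return seconds * RATE_HZ * CHANNELS * BYTES_PER_SAMPLE


def frames_to_ms(frames: int) -> float:
    """Duration represented by a frames_total value."""
    return frames * 1000 / RATE_HZ


print(bytes_for_seconds(1), frames_to_ms(40815))
```

One second is 192000 bytes, matching what playtone wrote, and frames_total=40815 works out to roughly 850 ms.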

Bench session 2 (criteria 4-7, scripted via audiofs/bench-f3b.sh): 14 PASS / 0 FAIL / 0 WARN.

  • Criterion 4 (source swap): with hw.audiofs.test_tone=1 set first, cdev_open swapped source from SINE to USER atomically without a stream restart. cdev_open arg=0x0 confirmed needs_stream_begin=0; writer_seq advanced 11 -> 13 (one tunable-driven stream_begin pair spanning both controllers) but did NOT advance through the cdev_open / cdev_close window. cdev_close arg=0x1 confirmed want_sine=1, swapping back to SINE without a stream_end.

  • Criterion 5 (back-pressure): 3-second playtone took 2.857 sec wall-clock (would be ~0 sec if back-pressure were broken; would be infinity if deadlocked). Ring drain rate gating write(2) is working as designed. frames_total=136788 in expected 100k-150k range.

  • Criterion 6 (exclusive open / cleanup): double-open correctly returned EBUSY ("playtone: open /dev/audiofs0: Device busy"). After holder released, second playtone succeeded. SIGKILL on a 10-sec playtone victim triggered D_TRACKCLOSE-driven cleanup; subsequent open succeeded immediately. No stuck cdev_open flag, no leaked kthread, no DMA leak.

  • Criterion 7 (no deadlock/panic/leak): clean kldunload after the full test sequence; no panic, LOR, or abandoned-kthread indicators in dmesg.
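
The source-swap and exclusive-open behavior verified in criteria 4 and 6 can be summarized as a small state machine for a single controller. This is a model reconstructed from the bench descriptions above, not the kernel code; the class name, method names, and event strings are illustrative.

```python
import errno

STOPPED, SINE, USER = "stopped", "running-sine", "running-user"


class Controller:
    def __init__(self):
        self.source = STOPPED
        self.test_tone = False
        self.cdev_is_open = False
        self.events = []  # stream events that would reach the F.2 ring

    def set_test_tone(self, on) -> None:
        self.test_tone = bool(on)  # always recorded
        if self.cdev_is_open:
            return                 # cdev consumer wins
        if self.test_tone and self.source == STOPPED:
            self.events.append("stream_begin")
            self.source = SINE
        elif not self.test_tone and self.source == SINE:
            self.events.append("stream_end")
            self.source = STOPPED

    def cdev_open(self) -> None:
        if self.cdev_is_open:
            raise OSError(errno.EBUSY, "Device busy")
        self.cdev_is_open = True
        if self.source == STOPPED:
            self.events.append("stream_begin")
        self.source = USER         # swap from sine needs no stream restart

    def cdev_close(self) -> None:
        self.cdev_is_open = False
        if self.test_tone:
            self.source = SINE     # swap back, no stream_end
        else:
            self.events.append("stream_end")
            self.source = STOPPED


c = Controller()
c.set_test_tone(1)
c.cdev_open()
c.cdev_close()
print(c.source, c.events)
```

With the tone enabled first, the open/close window adds no stream events, which is the single-controller analogue of writer_seq not advancing through that window.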

audiofs1 anomaly resolution: the previous bench session's "audiofs1 emitted stream events without OSS" puzzle was my misreading of an earlier dmesg. The current bench confirms audiofs1 has OSS=6 (AMD discrete GPU HDMI controller with six Digital_Other_Out paths); stream_begin on audiofs1 succeeds, the kthread runs, LPIB advances. The F.3.b cdev exists for audiofs1 as well; whether HDMI sound reaches an attached display is F.3.f territory, but the audiofs kernel side works on the controller.

Two tooling bugs surfaced and were fixed during the F.3.b bench; no kernel bugs surfaced:

  • playtone Makefile man-page wart (NO_MAN=1 is deprecated; fixed to an empty MAN=). The binary built but make's return code was non-zero; bench-f3b.sh's pre-build check would have masked it. Fixed in commit 20f8c69.
  • bench-f3b.sh path bug (script placed at audiofs/tools/ but referencing ${SCRIPT_DIR}/tools/playtone/playtone, yielding audiofs/tools/tools/playtone/playtone). Fixed in commit 20f8c69 by relocating the script to audiofs/bench-f3b.sh alongside build.sh.
  • (Both issues were tooling, not the kernel implementation. The F.3.a session's audit discipline caught the kernel-side bugs before bench.)

F.3.b is closed [x]. The audiofs kernel side now has a complete user-controlled output path: applications open the cdev, write samples, close; the kthread refills BDL fragments from the user ring with back-pressure on full and silence on empty. semasound (per ADR 0005) is the intended consumer; it can be written against the F.3.b surface as soon as F.5 work begins.

F.3.c (interrupt-driven position tracking) is the next sub-stage per ADR 0011's dependency map. F.3.c swaps the F.3.a kthread polling for the real HDA interrupt handler. F.3.d (xrun detection) depends on F.3.c. F.3.e (format negotiation) depends on F.3.b (now closed). F.3.f (HDMI) is parallel and can be taken at any time.

Status update 2026-05-30 (F.3.c implementation landed, [~] awaiting bench). F.3.c ratified to Accepted (ADR 0016) and implemented in one substantive commit (64d6716). Changes:

  • Kernel interrupt path (audiofs.c): new audiofs_intr_filter (filter context, MTX_SPIN intr_lock, three register I/Os max) and audiofs_intr_thread (ithread context, hw_lock + user_ring_mtx) replace the F.3.a polling kthread. audiofs_refill_worker deleted along with its kproc_create/exit, stop_requested signalling, and msleep-on-hw_lock wait. output_stream_running renamed to output_stream_active (12 sites).

  • IRQ resource lifecycle (audiofs_attach / audiofs_detach): pci_alloc_msi attempted first (single vector); fall back to INTx (RF_SHAREABLE | RF_ACTIVE) if MSI not granted with count=1. bus_setup_intr registers filter+ithread handlers under INTR_TYPE_AV | INTR_MPSAFE. bus_teardown_intr in detach blocks until any in-flight ithread completes; then IRQ release and pci_release_msi as appropriate. Setup failure is a hard attach error (no polling fallback per ADR 0016).

  • stream_begin / stream_end ordering (the critical race-free choreography): stream_begin: configure -> DAC bind -> events publish -> active=1 (intr_lock) -> SIE+GIE+CIE in INTCTL (hw_lock) -> dma_sync -> RUN (hw_lock). First interrupt fires within ~21 ms. stream_end: active=0 (intr_lock; ithread entry guard now rejects) -> SIE clear (hw_lock) -> RUN clear (hw_lock) -> final LPIB read -> DAC unbind -> stream_end event. No msleep wait; no abandonment timeout.

  • BDL IOC bits flipped: configure_output_stream writes ioc=htole32(1) on both entries (was 0). One interrupt per ~21 ms fragment, ~47 interrupts/sec/stream at 48k/16/stereo.

  • Diagnostics: new sysctls dev.audiofs.N.interrupts_setup (read-only string: "msi" / "intx" / "none") and underflow_count (read-only uint64). New audiofs_log entries intr_setup_msi / intr_setup_intx at attach; intr_teardown at detach; irq_alloc_failed and irq_setup_failed on attach error paths.

  • Comment refresh: F.3.a streaming-header block rewritten to document the F.3.c interrupt model and the four-tier lock order (state_sx -> user_ring_mtx -> hw_lock -> intr_lock). F.3.b user-ring comments updated to "ithread drain" instead of "kthread drain". The historic "mirrors hdac.c" scaffolding comments were left alone where they refer to register-level sequences (still spec-accurate); only the runtime descriptions that referenced the kthread were updated.

  • Removed: AUDIOFS_REFILL_POLL_TICKS, AUDIOFS_STREAM_STOP_TIMEOUT macros (unused after kthread retirement); #include <sys/kthread.h>.

Audit performed BEFORE commit (the F.3.b discipline lesson, expanded for F.3.c's hardware-shaped surface):

  • Brace balance 297/297, 0 em-dashes.
  • All three new symbols (audiofs_intr_filter, audiofs_intr_thread, audiofs_sysctl_interrupts_setup) have forward decl + definition + call site.
  • No msleep_spin on MTX_DEF anywhere. The one msleep remaining is F.3.b's back-pressure msleep on user_ring_mtx (MTX_DEF), correct.
  • Lock acquisitions in ithread never overlap (intr_lock released before hw_lock acquired; hw_lock released before refill_fragment takes user_ring_mtx). No recursive locking.
  • Filter handler accesses only INTSTS and SDnSTS, both of which are exclusively owned by the interrupt path (verified by grep). Other registers (INTCTL, SDnCTL, SDnLPIB, etc.) are hw_lock-only. No torn-read risk between filter and other paths.
  • stream_begin order (active=1 -> SIE -> RUN) and stream_end order (active=0 -> SIE clear -> RUN clear -> final LPIB) close the SIE-cleared-but-ithread-already-scheduled race via the entry guard.
  • pci_alloc_msi failure handling distinguishes "did not succeed" (do not release) from "succeeded but wrong count" (release before retry to INTx).
  • LPIB delta wrap arithmetic same as F.3.a's polling worker (proven correct in F.3.a/b bench).
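
The MSI-versus-INTx decision audited above reduces to three cases. A sketch of that decision as a pure function, where the two inputs stand in for the outcome of the MSI allocation attempt; the function name and return shape are hypothetical.

```python
def choose_interrupt_mode(msi_succeeded: bool, msi_count: int):
    """Return (mode, release_msi_before_intx) for one attach attempt."""
    if msi_succeeded and msi_count == 1:
        return ("msi", False)
    if msi_succeeded:
        # Granted, but not as a single vector: release before retrying INTx.
        return ("intx", True)
    # Never granted: nothing to release.
    return ("intx", False)


print(choose_interrupt_mode(True, 1), choose_interrupt_mode(False, 0))
```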

Verification state:

  • Kernel path: [~]. Compile-audited (braces balanced, em-dashes 0, all new symbols defined and referenced, lock-ordering reviewed). Not yet built in a full PGSD kernel.
  • End-to-end: [~]. Owed: build the PGSD kernel; nm the .ko (audiofs_intr_filter / _thread / sysctl_interrupts_setup present; audiofs_refill_worker ABSENT); kldload; verify dmesg shows intr_setup_msi or intr_setup_intx; verify dev.audiofs.0.interrupts_setup reports "msi" or "intx"; rerun bench-f3b.sh (the F.3.b 14-PASS suite is the F.3.c gate per ADR 0016 closure criterion 2); verify ps -auxw | grep audiofs_refill is empty (kthread really gone); verify vmstat -i shows ~47 audiofs interrupts/sec under sustained playtone load (not thousands; not zero); confirm clean kldunload.
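
The ~47 interrupts/sec expectation for the vmstat -i check follows from the fragment duration. The fragment size is not stated in this entry; the sketch below assumes 1024 frames per fragment because that matches the quoted ~21 ms at 48 kHz.

```python
RATE_HZ = 48000
ASSUMED_FRAGMENT_FRAMES = 1024  # assumption: chosen to match "~21 ms"


def fragment_ms(frames: int = ASSUMED_FRAGMENT_FRAMES) -> float:
    """Playback time covered by one BDL fragment."""
    return frames * 1000 / RATE_HZ


def interrupts_per_sec(frames: int = ASSUMED_FRAGMENT_FRAMES) -> float:
    """One IOC interrupt per fragment."""
    return RATE_HZ / frames


print(round(fragment_ms(), 2), interrupts_per_sec())
```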

The deployment-gate discipline applies: pull, install, build, nm gate for the new symbols and absence of the kthread, then load. Bench iteration via bench-f3b.sh remains the same (the suite is unchanged; F.3.c's behavioral invariants are required to be identical to F.3.b's from userland's perspective).

Known v1 caveats (unchanged from F.3.b):

  • ~85 ms cold-open sine leak from BDL initial fill.
  • close() does not drain queued audio.
  • Underflow counter accumulates internally; F.3.d will surface as F.2 xrun events.

If a bench iteration surfaces a kernel bug despite the pre-bench audit, the F.3.a discipline lesson applies: when two successive fix iterations fail, stop theorizing and read the spec.

Status update 2026-05-31 (F.3.c bench-verified [x] on pgsd-bare-metal). F.3.c is closed on real hardware. bench-f3b.sh, the F.3.b verification suite, runs unchanged under the new interrupt path and produces 14 PASS / 0 FAIL / 0 WARN. ADR 0016's closure criterion 2 (F.3.b's 14-PASS suite must continue to pass) is met, which is the strongest evidence the userland-visible behavior of audiofs is unchanged from F.3.b's verified baseline.

Two bench iterations were needed before close; both surfaced real kernel bugs that the pre-bench audit had missed:

  • Iteration 1 (commit 7237d41): SDnCTL IOCE / FEIE / DEIE bits were not set. The HDA spec (section 3.3.35) gates the BDL IOC honoring on the stream-level IOCE bit; without it, the controller treated the IOC=1 BDL entries as undefined-state triggers and stalled DMA after one fragment. Bench symptom: brief audible audio (one fragment) then silence; writer blocked indefinitely on back-pressure msleep; frames_total amounted to under 5 ms of audio over 35-second wall-clock windows.

  • Iteration 2 (commit fc03609): INTSTS / INTCTL bit positions for stream interrupts use the GLOBAL stream-descriptor enumeration (input streams first, then output streams), but the code used the local output-stream-index (1 << output_stream_idx) without the num_iss offset. For the Intel HDA in the iMac (num_iss=4), this set/checked bit 0 instead of bit 4, so the filter handler returned FILTER_STRAY on every interrupt and the ithread never ran. Bench symptom: audible continuous looping sine (the prefilled BDL looping forever); writer still blocked; frames_total still tiny because the ithread's LPIB-delta accumulation never executed. (The bug did not affect audiofs1 because num_iss=0 on the AMD HDMI controller, so output stream 0 happens to map to bit 0 there.)
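
The iteration 2 bit-position fix amounts to adding the input-stream count before shifting. A sketch with illustrative function names, using the num_iss values quoted in that bullet:

```python
def stream_intr_bit(num_iss: int, output_stream_idx: int) -> int:
    """INTSTS/INTCTL bit for an output stream in the global enumeration
    (input streams first, then output streams)."""
    return 1 << (num_iss + output_stream_idx)


def buggy_stream_intr_bit(output_stream_idx: int) -> int:
    """The pre-fix computation: local output index, no num_iss offset."""
    return 1 << output_stream_idx


# num_iss=4: the fixed code selects bit 4, the buggy code selected bit 0.
print(hex(stream_intr_bit(4, 0)), hex(buggy_stream_intr_bit(0)))
```

With num_iss=0 the two computations agree, which is why the second controller was unaffected.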

After both fixes, bench-f3b.sh's 14 closure criteria all pass: cdev semantics, back-pressure timing (~3 sec wall-clock for 3-sec writes), exclusive open with EBUSY on double-open, SIGKILL cleanup via D_TRACKCLOSE, writer_seq advancement, no panic / LOR / abandoned indicators, clean kldunload.

Discipline lesson recorded: pre-bench audit caught all of F.3.b's potential kernel bugs (zero bench iterations for kernel issues; only tooling), but caught zero of F.3.c's kernel bugs (two bench iterations to surface both). The difference is hardware-shaped semantics: F.3.b's risks were concurrency (lock-class mismatch, race windows, recursive locks) which the audit reasoned about well; F.3.c's risks were spec-derived register semantics where the audit needed to verify against the HDA 1.0a spec, not just against the source code. Future hardware-shaped sub-stages: the audit must include a spec re-read alongside the code re-read. Specifically: when the design touches register-level enable bits, the audit must trace each enable from its register definition in the spec through to the code that sets/clears it, verifying that the bit position matches the enumeration order documented in the spec.

The four-level enable structure that bit us:

  • PCI level: pci_alloc_msi / bus_setup_intr (caught correctly in design).
  • Controller level: INTCTL GIE / CIE / SIE (SIE bit position bug, iteration 2).
  • Stream level: SDnCTL IOCE / FEIE / DEIE (missing entirely, iteration 1).
  • Source level: BDL entry IOC (caught correctly in design).

Each layer can independently gate the interrupt path. The audit should enumerate all four explicitly.
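One way to make that enumeration mechanical is a snapshot check that reports which of the four layers is not enabled. The sketch below is an assumption-laden illustration: the struct, the MISSING_* flags, and the function name are hypothetical, and the bit positions reflect our reading of the HDA 1.0a register tables and should be re-verified against the spec as part of the audit. CIE is left out because it gates controller (CORB/RIRB) interrupts, not stream interrupts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as read from the HDA 1.0a tables; re-verify before use. */
#define INTCTL_GIE (UINT32_C(1) << 31)	/* global interrupt enable */
#define SDCTL_IOCE (UINT32_C(1) << 2)	/* interrupt on completion enable */
#define SDCTL_FEIE (UINT32_C(1) << 3)	/* FIFO error interrupt enable */
#define SDCTL_DEIE (UINT32_C(1) << 4)	/* descriptor error interrupt enable */
#define BDLE_IOC   (UINT32_C(1) << 0)	/* per-entry interrupt on completion */

/* Hypothetical result flags, one per layer. */
enum {
	MISSING_PCI        = 1,
	MISSING_CONTROLLER = 2,
	MISSING_STREAM     = 4,
	MISSING_SOURCE     = 8
};

/* Hypothetical snapshot of the state an audit would inspect. */
struct intr_enable_snapshot {
	bool     irq_handler_installed;	/* PCI level: MSI allocated, handler set up */
	uint32_t intctl;		/* controller level */
	uint32_t sdctl;			/* stream level */
	uint32_t bdle_flags;		/* source level */
	unsigned num_iss;
	unsigned output_stream_idx;
};

/* Returns a mask of layers that would block the stream interrupt. */
unsigned
intr_enable_audit(const struct intr_enable_snapshot *s)
{
	const uint32_t sie = UINT32_C(1) << (s->num_iss + s->output_stream_idx);
	const uint32_t sd_enables = SDCTL_IOCE | SDCTL_FEIE | SDCTL_DEIE;
	unsigned missing = 0;

	if (!s->irq_handler_installed)
		missing |= MISSING_PCI;
	if ((s->intctl & INTCTL_GIE) == 0 || (s->intctl & sie) == 0)
		missing |= MISSING_CONTROLLER;
	if ((s->sdctl & sd_enables) != sd_enables)
		missing |= MISSING_STREAM;
	if ((s->bdle_flags & BDLE_IOC) == 0)
		missing |= MISSING_SOURCE;
	return (missing);
}
```

Under these assumptions, the iteration 1 state reports MISSING_STREAM and the iteration 2 state (SIE bit 0 set on a controller with num_iss=4) reports MISSING_CONTROLLER.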

The interrupt path is now the only path for stream progression. The F.3.a polling kthread is fully retired (audiofs_refill_worker is gone and the audiofs_refill kproc is absent from ps). The ithread refills BDL fragments from the user_ring as interrupts fire (~47/sec/stream at 48k/16/stereo). The back-pressure msleep on the writer side is woken correctly when the ring drains. stream_end completes synchronously without msleep waits, because the active flag gates ithread entry and bus_teardown_intr blocks until in-flight ithread invocations complete.
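The ~47/sec figure is consistent with one IOC interrupt per BDL fragment. The sketch below shows the arithmetic; the 4096-byte fragment size is an assumption chosen because it reproduces the observed rate, not a value stated in this entry, and the function names are illustrative.

```c
#include <stdint.h>

/* Bytes per frame for interleaved PCM. */
uint32_t
pcm_frame_bytes(uint32_t bits_per_sample, uint32_t channels)
{
	return ((bits_per_sample / 8) * channels);
}

/* IOC interrupts per second if every BDL fragment raises one interrupt. */
double
ioc_interrupts_per_sec(uint32_t rate_hz, uint32_t bits_per_sample,
    uint32_t channels, uint32_t fragment_bytes)
{
	double bytes_per_sec;

	bytes_per_sec = (double)rate_hz *
	    (double)pcm_frame_bytes(bits_per_sample, channels);
	return (bytes_per_sec / (double)fragment_bytes);
}
```

At 48 kHz, 16-bit, stereo, the stream consumes 192,000 bytes/sec; with the assumed 4096-byte fragment that is 46.875 interrupts per second per stream.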

F.3.c is closed [x]. F.3.d (xrun event surfacing) is the next unblocked sub-stage; it converts the output_stream_underflow_count counter (already accumulated by the F.3.c ithread on FIFOE) into F.2 xrun events. F.4 (clock writer) can now be designed against the interrupt-paced frames_played value F.3.c maintains. F.3.e (format negotiation) is unblocked and parallel. F.3.f (HDMI) is parallel and can be taken at any time.

[ ] AD-4: Graphics output: replace efifb / DRM dependency (Open, Large; not scheduled)

drawfs currently uses efifb (or DRM/KMS on capable hardware) for display output. Both are accepted as platform transport today. Direct GPU programming would be the largest dependency replacement UTF could undertake.

This is the biggest scope item in the discipline's "in scope for review" list. Vendor-specific GPU programming, command submission, power management, and multi-vendor support make this a multi-year undertaking even for a single vendor. No design document exists yet. Not scheduled.

[ ] AD-11: Console and recovery: pgsd-sessiond as universal login surface (Open, Medium; reframed 2026-05-21)

Tracks: a future ADR (AD-11.1 below) and three small mechanism sub-items. Depends on SM-1 (pgsd-sessiond) already done; depends on rc.d service-lifecycle infrastructure from AD-12 already done.

Reframed 2026-05-21 under "convergent paths" model. The previous AD-11 framing (kernel-side console replacement, consfs, drawcons design ADR, kernel-side panic rendering) contemplated a much larger commitment than necessary. The reframed AD-11 is small and stays inside UTF's existing substrate: every path that today ends at a vt(4) ttyvN shell instead ends at pgsd-sessiond's login prompt rendered through drawfs, on the same input substrate (inputfs) and the same display substrate (drawfs) UTF sessions normally use.

Three convergent paths, one endpoint:

  1. Normal bootstrap. Standard rc.d ordering completes; pgsd-sessiond starts and presents the login prompt. User logs in as their normal account into a normal UTF session. No change from today.

  2. Operator-selected recovery (Alt held during bootstrap). The loader or early rc.d stage detects Alt held at a known point in boot. Instead of the normal session profile, rc.d selects a recovery profile: minimal services started, recovery tools on PATH, defaults biased toward root login. pgsd-sessiond still presents the login prompt; user logs in (typically as root) and recovers. The recovery session is a normal UTF session with a different startup profile, not a separate console layer.

  3. Auto-recovery (panic or detectable bootstrap failure). On boot the system reads a persisted "last boot did not complete cleanly" marker (NVRAM, /boot, or a marker file on a known-good filesystem). If present, bootstrap selects the recovery profile automatically, same as if Alt were held. The marker is set by the panic handler before reboot, by detectable rc.d failures (ZFS import fails, critical service refuses to start), or by user action ("reboot into recovery next time"). The marker is cleared once a recovery session is started, so the next normal boot is normal again.

Use cases for path 3 include not only failure recovery but routine administration: changing boot environments, rolling back a ZFS BE, repairing a configuration that produces a broken normal session. Recovery is a normal mode of operation that users choose, not only a state the system falls into after breakage.

What this commits UTF to owning:

  • Detection of the recovery condition at three trigger points (Alt at boot, persisted marker, detectable failure during bootstrap).
  • A recovery session profile, selectable by rc.d or pgsd-sessiond based on the trigger.
  • The persisted-marker mechanism, including who writes it (panic handler, failing rc.d scripts, user-invoked "reboot into recovery") and when it clears.

What this commits UTF NOT to owning:

  • Kernel-side text rendering for panic messages. If the kernel panics during a window where drawfs is not yet up (very early boot) or where the panic prevents reboot (extremely rare), there is no UTF console to display the message; the operator's path is to boot from external media. This is the same posture FreeBSD GENERIC takes when vt(4) cannot initialise.
  • A getty-equivalent or TTY-like session abstraction. pgsd-sessiond is the login surface; UTF sessions are the post-login environment. Job control, line discipline, and other TTY semantics are provided by the shell process under the session, not by UTF itself.
  • Owning the boot console messages from the FreeBSD kernel before drawfs loads. Those continue to go to wherever FreeBSD writes them. On a PGSD kernel with AD-39's compile-out, there is no console-on-framebuffer driver to receive them, so they have no on-screen destination until drawfs.ko loads; messages still reach the dmesg ring buffer and any configured serial console. (Pre-AD-39 kernels with vt_efifb compiled in put boot messages on the framebuffer via that driver; PGSD does not. There is no separate efifb driver in FreeBSD distinct from vt_efifb. The previous wording of this entry, which implied one, was wrong; corrected 2026-05-27 evening after audit.) UTF replaces the user-facing console, not the kernel's own diagnostic output channel.

Sub-stages:

  • AD-11.1: write the ADR. Position UTF's recovery posture explicitly. Settle the persisted-marker mechanism (where does the marker live, who can write it, when does it clear) and the Alt-detection mechanism (loader-level via kenv, or early rc.d via inputfs reading a designated key, or some other approach). Document the explicit non-commitment to kernel-side panic rendering and the external-media fallback for unrecoverable cases.

  • AD-11.2: implement the bootstrap-time trigger mechanism. Alt detection plus marker-file read; both resolve to a single boolean visible to subsequent rc.d stages and to pgsd-sessiond. Bench-verified by holding Alt at boot and observing the recovery profile activate.

  • AD-11.3: implement the recovery session profile. An rc.d profile variant plus a pgsd-sessiond configuration that adjusts defaults. Bench-verified by activating recovery via Alt and via marker, and confirming the session behaves as designed (recovery tools on PATH, minimal services, root login pre-selected if that is the chosen design).

  • AD-11.4: implement the marker-writing paths. Panic handler hook to set the marker before reboot. rc.d hook to set the marker on detectable failure. User-invoked reboot-into-recovery command to set the marker explicitly. Marker-clearing once a recovery session is started.

  • AD-11.5: bench verification across the three trigger paths. Hold Alt at boot. Boot from a normal session with a deliberately broken config to trigger rc.d failure path. Trigger a panic (using existing panic-injection mechanisms from AD-9 fuzz harness) and verify the next boot enters recovery automatically. Confirm that an operator-initiated reboot-into-recovery does the same.
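The trigger resolution in AD-11.2 and the marker lifecycle in AD-11.4 can be sketched as a small state model. Everything here is hypothetical: the type and function names are illustrative, the marker is modelled as an in-memory flag and not as the NVRAM or file-backed store the ADR will settle, and the clear-on-session-start rule is taken from the path 3 description above.

```c
#include <stdbool.h>

enum boot_profile { PROFILE_NORMAL, PROFILE_RECOVERY };

/* Hypothetical model of the inputs visible at bootstrap. */
struct boot_state {
	bool alt_held;		/* operator held Alt at the detection point */
	bool marker_present;	/* persisted "last boot did not complete" marker */
};

/* AD-11.2: both triggers resolve to a single boolean. */
enum boot_profile
select_profile(const struct boot_state *b)
{
	return ((b->alt_held || b->marker_present) ?
	    PROFILE_RECOVERY : PROFILE_NORMAL);
}

/* AD-11.4: panic handler, failing rc.d script, or user request sets it. */
void
mark_recovery_needed(struct boot_state *b)
{
	b->marker_present = true;
}

/* AD-11.4: the marker clears once a recovery session is started. */
void
on_recovery_session_start(struct boot_state *b)
{
	b->marker_present = false;
}
```

In this model a boot after mark_recovery_needed selects the recovery profile, and the boot after on_recovery_session_start is normal again unless Alt is held.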

Asymmetry vs the previous AD-11 framing: the old entry contemplated kernel-side console rendering, a consfs substrate, and panic-resilient text output. Those commitments would have made AD-11 the largest single piece of work in the UTF backlog. The reframed AD-11 stays inside the session layer where UTF already operates; the new work is detection, profile selection, and a small persisted-marker mechanism. Estimated scope drops from Large to Medium.

Discipline framing: the original AD-11 entry argued that vt(4) competing with UTF surfaces during boot is the kind of "external code that does not share UTF's commitments" the discipline doc warns about. That argument applies to the post-boot, post-drawfs-loaded console, not to early-kernel-printf or panic messages. The reframed AD-11 owns the former and explicitly does not own the latter. UTF treats early-kernel and panic output as platform transport, the same posture inputfs takes toward early-boot keyboard input via kbdmux before inputfs attaches.

Depends on:

  • SM-1 (pgsd-sessiond): already done. AD-11 extends SM-1's session-profile machinery; no new dependency burden.
  • AD-12 (service lifecycle): already done. AD-11's recovery-profile rc.d work fits inside the existing service-lifecycle conventions.
  • AD-39 (kernel console drivers compiled out): already done. AD-39 removed vt(4) from PGSD; the "compete for framebuffer" failure mode no longer exists. AD-11 is not about reclaiming the framebuffer from vt(4) (AD-39 did that); it is about giving operators a recovery path on a UTF-owned display.

(The previous AD-11 entry listed AD-10 as a dependency. AD-10 was superseded by AD-39 on 2026-05-13; the reframed AD-11 depends on AD-39 instead. AD-4 graphics output replacement is no longer a partial dependency because the reframed AD-11 does not contemplate kernel-side console rendering.)

Risks (much smaller than the previous framing):

  • Marker placement and clearing: a marker that sticks across reboots permanently traps the system in recovery mode. The clearing logic must be correct: clear on entering a recovery session, not on completing one (so an unclean exit from recovery still leaves the marker valid for retry).
  • Alt-detection placement: too early (loader level) means a separate keyboard path that doesn't use inputfs; too late (post-pgsd-sessiond) misses boot failures that prevent reaching that stage. The ADR settles where this lives.
  • Recovery session itself being broken: the recovery session uses the same drawfs and inputfs substrate as normal sessions. If those are broken badly enough to prevent rendering a login prompt, AD-11 cannot help and external media is the answer. This is acknowledged in the ADR rather than worked around.

What this entry does not claim:

  • It does not claim vt(4) is broken. vt(4) works correctly within its design; PGSD has removed it only because UTF does not need it as a session console once SM-1 and AD-11 are both in place.
  • It does not commit UTF to owning the boot console (early kernel printf), the panic console (KDB output post-fault), or the system console of last resort. Those remain platform transport, like inputfs's treatment of early kbdmux keyboard.
  • It does not commit to taking over single-user mode in the FreeBSD-init sense. Single-user mode is superseded by the recovery session profile; operators who need a recovery shell get one via path 2 or path 3 above.

Historical record:

  • 2026-05-04: AD-11 first discussed, prompted by AD-10 framing. Original scope was "should UTF own the console itself, on the same discipline grounds that motivated inputfs / drawfs / audiofs?"
  • 2026-05-10: Option Y decision. UTF commits to a native login and session path (SM-1) but does not commit to kernel-side console takeover. AD-11 becomes "retire vt(4) even for recovery" as a narrower question.
  • 2026-05-13/14: AD-39 lands. vt(4) is compiled out of the PGSD kernel entirely. The "compete with vt(4) for the framebuffer" problem evaporates by construction.
  • 2026-05-21: AD-11 reframed under the "convergent paths" model. Recovery becomes a session profile, not a separate console layer. Scope drops from Large to Medium. This entry as currently written replaces the previous Option-Y entry.

[~] AD-18: drawfs locking-discipline fixes from DF-4 audit (.1–.6 done 2026-05-08; .7 deferred to DF-6, re-tagged 2026-05-21)

Tracks: drawfs/sys/dev/drawfs/drawfs.c, drawfs/sys/dev/drawfs/drawfs_surface.c, drawfs/sys/dev/drawfs/drawfs_drm.c. Audit findings recorded in docs/DF4_VERIFICATION.md section "Findings".

The DF-4 static audit (docs/DF4_VERIFICATION.md) walked every lock acquisition site in the drawfs kernel module and found seven WITNESS-detectable bugs. None affect the substrate's operational behaviour under the release kernel; all surface under WITNESS or under the specific race conditions WITNESS is designed to catch.

Status 2026-05-08, re-tagged 2026-05-21: AD-18.1 through .6 are done and bench-verified under PGSD-DEBUG. AD-18.7 (DRM PAGE_FLIP path) is structurally deferred to DF-6 since the calling site (KMS page-flip ioctl wiring) does not yet exist; the fix design is captured but cannot be implemented or verified until DF-6 wires drawfs_reply_surface_present to drawfs_drm_surface_present. The previous deferral target of DF-3 was imprecise because DF-3 closed as a compile-clean skeleton without that wiring.

The seven fixes:

  • AD-18.1 (Done 2026-05-07): recursive s->lock acquire via surface_lookup call from inside find_session_for_surface_locked. Fixed by adding a drawfs_surface_lookup_locked variant in drawfs_surface.c that asserts s->lock is held and walks the list without acquiring; switched drawfs_find_session_for_surface_locked to use it. The public drawfs_surface_lookup is preserved for callers that don't hold the lock (e.g. drawfs_reply_surface_present) and now delegates to _locked to avoid duplicated walk code.

  • AD-18.2 (Done 2026-05-07): vm_pager_allocate with s->lock held in surface_get_vmobj. Fixed by pinning the selected surface's id and bytes_total under the first lock-hold, releasing the lock for vm_pager_allocate, then re-acquiring to re-find the surface by id and either install (we won) or yield to a concurrent installer (deallocating our redundant vm_object outside the lock). Added hw.drawfs.vmobj_install_lost sysctl to count install-race losses; should remain 0 on single-threaded workloads. Surface bytes_total is immutable post-create and ids are monotonic (never reused), so the pinned values stay correct across the unlocked window.

  • AD-18.3 (Done 2026-05-07): malloc(M_WAITOK) with s->lock held in input buffer growth path. Fixed by rewriting drawfs_ingest_bytes as a loop with drop-and-retry around the M_WAITOK malloc. Each iteration takes the lock, decides whether to fast-path (existing in_cap fits), install a pre-allocated buffer (if any), or compute a new newcap and drop lock to allocate. Loop bound is log2(MAX_FRAME / initial cap) ≈ 8 iterations worst case, almost always 0 or 1 in practice. Added hw.drawfs.inbuf_grow_race_lost sysctl to count races where our pre-allocated buffer was unneeded after re-acquire (another writer grew us, or process_inbuf consumed enough to make room).

  • AD-18.4 (Done 2026-05-07): same shape as AD-18.3, frame extraction path. Fixed by rewriting drawfs_try_process_inbuf with drop-and-revalidate around the per-frame extraction malloc. The lock is released for the malloc, then re-acquired; the frame at the head of inbuf is re-validated by header memcmp against the pinned copy. If validation fails (another extractor consumed our frame, or the session is closing), drop our buffer and retry the loop. Added hw.drawfs.frame_extract_race_lost sysctl. Both AD-18.3 and AD-18.4 land in the same commit since they share the drawfs_write codepath, the same fix shape, and the same file.

  • AD-18.5 (Done 2026-05-08): unprotected stats updates, five sites in total. The originally-filed site at drawfs.c:639 (now line 681 after AD-18.1-.4 line shifts) was s->stats.bytes_in += n in drawfs_write, completely outside any lock. Audit of the file revealed four more unprotected updates in the same family: frames_invalid and frames_processed in drawfs_try_process_inbuf (after mtx_unlock for validation/process calls), and messages_processed and messages_unsupported in drawfs_process_frame (no lock at all). All five violated the documented locking-model invariant (drawfs.c:218-235: "Statistics counters (stats.*) protected by s->lock"). Fixed by:

    • moving the bytes_in increment into drawfs_ingest_bytes where the lock is already held (with a counted flag to prevent double-counting on grow-race retries; minor semantic refinement: closing sessions no longer count rejected bytes);
    • wrapping each of the other four updates in a take-update-release pattern (mtx_lock; stats.X++; mtx_unlock; reply_call()) so the stats update is serialized but the reply call (which itself takes the lock via drawfs_enqueue_event) is not nested.

    Also expanded the locking-rules comment to document the take-update-release pattern as the canonical idiom for stats updates around reply calls. WITNESS does not directly catch this kind of data race (no lock-order violation), so verification is by code review against the invariant: the absence of any s->stats.* update outside s->lock confirms the fix.

  • AD-18.6 (Done 2026-05-08): surface-list teardown without s->lock in drawfs_surfaces_free_all. The function walked s->surfaces and called TAILQ_REMOVE outside any lock, and updated s->map_surface_id, s->surfaces_count, s->surfaces_bytes partly outside the lock. All violated the locking-model invariant (drawfs.c:218-235: "Surface list (s->surfaces) is also protected by s->lock" and "Statistics counters / session state under s->lock"). Latent in practice because by the time priv_dtor invokes this function, the session has already been removed from the global registry (drawfs.c:904-906) and no concurrent access is possible. Fixed by restructuring the loop into the standard drop-lock-around-vm_object_deallocate pattern: each iteration takes s->lock; if the surface list is empty, snaps stat drift to zero and returns; otherwise unlinks the head surface, captures its vmobj, decrements counters, all under the lock; releases the lock; calls vm_object_deallocate; free()s the surface struct. Defense-in-depth fix; signed off by code review against the documented invariants.

  • AD-18.7: drm_ioctl_kern with dd->drm_mtx held in DRM PAGE_FLIP path. Currently latent (the function drawfs_drm_surface_present is unreached; DF-6 has not yet wired SURFACE_PRESENT to dispatch into it). Capture flip parameters under lock, drop, ioctl, re-acquire, install. Fix should land in the DF-6 wiring commit.
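The drop-and-retry shape used in AD-18.3 (and, with revalidation, in AD-18.2 and AD-18.4) can be sketched in userspace C. This is not the drawfs code: a pthread mutex stands in for s->lock, plain malloc stands in for malloc(M_WAITOK), and every name, the 64-byte starting capacity, and the 1 MiB bound are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_FRAME   ((size_t)1 << 20)	/* hypothetical bound */
#define SKETCH_INITIAL_CAP ((size_t)64)		/* hypothetical start */

struct sketch_session {
	pthread_mutex_t lock;		/* stands in for s->lock */
	unsigned char  *inbuf;
	size_t          in_len;
	size_t          in_cap;
	unsigned long   grow_race_lost;	/* spare buffer turned out unneeded */
};

void
sketch_session_init(struct sketch_session *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->inbuf = NULL;
	s->in_len = s->in_cap = 0;
	s->grow_race_lost = 0;
}

int
sketch_ingest_bytes(struct sketch_session *s, const void *data, size_t n)
{
	unsigned char *spare = NULL, *old = NULL;
	size_t spare_cap = 0, want;

	if (n == 0)
		return (0);
	if (n > SKETCH_MAX_FRAME)
		return (-1);
	for (;;) {
		pthread_mutex_lock(&s->lock);
		/* Install the pre-allocated buffer if it is still useful. */
		if (spare != NULL && spare_cap > s->in_cap &&
		    spare_cap - s->in_len >= n) {
			if (s->in_len > 0)
				memcpy(spare, s->inbuf, s->in_len);
			old = s->inbuf;
			s->inbuf = spare;
			s->in_cap = spare_cap;
			spare = NULL;
		}
		/* Fast path: the current buffer has room. */
		if (s->in_cap - s->in_len >= n) {
			memcpy(s->inbuf + s->in_len, data, n);
			s->in_len += n;
			if (spare != NULL)
				s->grow_race_lost++;
			pthread_mutex_unlock(&s->lock);
			free(spare);	/* frees happen outside the lock */
			free(old);
			return (0);
		}
		/* Decide the new capacity under the lock, allocate outside it. */
		want = s->in_cap != 0 ? s->in_cap : SKETCH_INITIAL_CAP;
		while (want - s->in_len < n)
			want *= 2;
		pthread_mutex_unlock(&s->lock);

		free(spare);		/* stale spare from an earlier pass */
		spare = malloc(want);	/* the call that may sleep */
		if (spare == NULL)
			return (-1);
		spare_cap = want;
	}
}
```

The allocation never happens with the lock held, and each pass re-checks the buffer state after re-acquiring it, so a concurrent writer that grew the buffer in the unlocked window simply turns the spare into a counted loss. Capacity doubling keeps the number of passes logarithmic in the size ratio.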

Sequencing: the sub-stages are independent and can land in any order. The recommended order follows the audit numbering: AD-18.1 first because it is the most clearly fatal (recursive acquire on a non-recursive mutex), then the M_WAITOK bugs as the most common sleep-with-lock-held class.

Verification: ideally each sub-stage is verified by the DF-4 WITNESS run going green for that specific finding. In the absence of a debug kernel, code review against the documented locking invariants (drawfs.c:182-198) is the available bar.
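For reviewers, the take-update-release idiom from AD-18.5 is the shape to look for around every stats update that precedes a reply call. A minimal userspace sketch, with a pthread mutex standing in for s->lock and hypothetical names throughout:

```c
#include <pthread.h>

struct stats_session {
	pthread_mutex_t lock;		/* stands in for s->lock */
	unsigned long   frames_invalid;
	unsigned long   events_queued;
};

void
stats_session_init(struct stats_session *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->frames_invalid = 0;
	s->events_queued = 0;
}

/* Stand-in for a reply path that takes the session lock itself. */
static void
sketch_enqueue_event(struct stats_session *s)
{
	pthread_mutex_lock(&s->lock);
	s->events_queued++;
	pthread_mutex_unlock(&s->lock);
}

/* Take-update-release: the counter is serialized, the reply is not nested. */
void
sketch_reject_frame(struct stats_session *s)
{
	pthread_mutex_lock(&s->lock);
	s->frames_invalid++;
	pthread_mutex_unlock(&s->lock);

	sketch_enqueue_event(s);	/* called with the lock released */
}
```

Calling the reply path while still holding the lock would self-deadlock on a non-recursive mutex, which is the nesting the idiom avoids.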

Why filed as a separate AD rather than rolled into DF-4: DF-4 is the verification work. AD-18 is the fix work that DF-4's audit identified. Keeping them separate lets DF-4 close when verification runs (and findings are confirmed) without needing to wait on every fix to land first.

Discovered: DF-4 static audit, 2026-05-05.

[ ] AD-25: cursor motion smoothness (Open, Medium-Large; reframed 2026-05-10 after instrumentation)

Reframed 2026-05-10. Instrumentation patch landed (the ad25_diagnostic event emitted from semadraw/src/compositor/compositor.zig per composite cycle when UTF_COMPOSITOR_INSTRUMENT=1 is set in semadrawd's environment). Bench-collected with cursor in steady motion on pgsd-bare-metal-test-machine. The original two hypotheses below were both ruled out; the actual bottleneck lives elsewhere. The original entry is preserved at the end of this section for historical context.

Discovery plan landed 2026-05-12 as semadraw/docs/adr/0007-ad25-cursor-motion-discovery-plan.md. The ADR commits to three instrumentation rounds: pump cadence (U1), render-phase breakdown (U2), and a poll-timeout experiment (U3), with explicit decision criteria for each. Fix ADR(s) are deferred to 0008+ once round data is in hand. This BACKLOG entry remains the operational tracker; ADR 0007 is the long-term record of the discovery framing.

Round 1 findings recorded 2026-05-12. Round 1 instrumentation landed in commit 6d670b9 (pump_diagnostic event in pumpCursorPosition, gated on UTF_PUMP_INSTRUMENT=1) and was bench-collected on pgsd-bare-metal-test-machine. The captured 148 ms log window (limited by current s6-log retention) showed 9,998 pump events at an average rate of ~67,500 events/second (14.8 us per pump), with 1 of 9,998 events reporting pos_changed:true.

This finding invalidates the ADR's original framing of Round 1's question. The main loop is not paced by the 100 ms posix.poll timeout during cursor motion; the /dev/draw pollable fd shows readable continuously while inputfs is active, so the timeout never bites. The pump rate of 67 kHz is bounded by per-iteration loop work, not by any deliberate cadence floor. Composite cadence (still ~8.7 Hz per the 2026-05-10 bench) is gated by needsComposite() returning false on the vast majority of iterations, not by the loop being slow.

Round 2 redirects from "render-phase breakdown" to composite-gate instrumentation per the ADR's addendum. The 67 kHz busy-spin finding is broader than AD-25 and has been opened as a separate track; see AD-32. Round 1 instrumentation remains in the tree, gated on the env var, for future regression checks.

Round 2 findings recorded 2026-05-12. Round 2 instrumentation landed in commit b345984 (composite_gate_diagnostic event from Compositor.needsComposite(), gated on UTF_COMPOSITE_GATE_INSTRUMENT=1) and was bench-collected on pgsd-bare-metal-test-machine with both UTF_PUMP_INSTRUMENT=1 and UTF_COMPOSITE_GATE_INSTRUMENT=1 active. The captured 73.6 ms log window showed 9,998 needsComposite calls with the following gate-state distribution:

  • has_damage:true, should_composite:true: 1 (0.01%)
  • has_damage:true, should_composite:false: 0
  • has_damage:false, should_composite:true: 7,291 (72.9%)
  • has_damage:false, should_composite:false: 2,706 (27.1%)
  • state_valid:false: 0

Findings: the FrameScheduler is not the gate (should_composite returned true on 72.9% of calls). The damage path is wired correctly (the 1 has_damage:true event corresponds 1:1 with the 1 pos_changed:true pump_diagnostic event from the same session). Composite is gated because the pump observed pos_changed:true only once in 9,998 reads.
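The distribution above can be tallied with a small model of the gate. The sketch assumes that a composite runs only when damage is pending and the scheduler agrees, which is inferred from the findings and not copied from Compositor.needsComposite(); all names are hypothetical.

```c
#include <stdbool.h>

struct gate_sample {
	bool has_damage;
	bool should_composite;
};

/* Assumed gate: damage pending AND scheduler ready. */
bool
sketch_needs_composite(struct gate_sample g)
{
	return (g.has_damage && g.should_composite);
}

struct gate_tally {
	unsigned long total;
	unsigned long scheduler_ready;
	unsigned long composites;
};

/* Add `count` observations of one gate state to the tally. */
void
gate_tally_add(struct gate_tally *t, struct gate_sample g, unsigned long count)
{
	t->total += count;
	if (g.should_composite)
		t->scheduler_ready += count;
	if (sketch_needs_composite(g))
		t->composites += count;
}

double
gate_percent(unsigned long part, unsigned long total)
{
	return (total != 0 ? 100.0 * (double)part / (double)total : 0.0);
}
```

Fed the Round 2 counts, the tally shows the scheduler ready on roughly 73% of calls but only one composite in 9,998, which points at missing damage and not at the scheduler.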

But: independent observation of inputfs's state region via inputdump state --watch --interval-ms 50 shows the pointer position changing at ~130 Hz during the same bench conditions; inputdump events --stats reports ≥204 events/sec on the event ring. The pump and inputdump are both consumers of the same mmap'd state file, opened through the same StateReader code path; yet they observe radically different update rates.

This mmap-visibility question is opened as a separate track; see AD-34. AD-25 stays open as the umbrella tracker; the "cursor motion smoothness" symptom will not resolve until AD-34's question is resolved (plus any follow-up). Round 2 instrumentation remains in the tree, gated on the env var, for future regression checks.

What the instrumentation showed

Sample lines from /var/log/utf/semadrawd/current during steady cursor motion:

{"type":"ad25_diagnostic","frame":233,"clear_calls":6,
 "clear_px":3456,"clear_ns":119617,
 "full_entry":false,"full_clearpath":false,
 "surfaces_rendered":1,"render_ns":8088460}
{"type":"ad25_diagnostic","frame":234,"clear_calls":2,
 "clear_px":1152,"clear_ns":35764,
 "full_entry":false,"full_clearpath":false,
 "surfaces_rendered":1,"render_ns":7817949}

Findings, in priority order:

  • full_entry and full_clearpath are false on every sampled frame. The compositor is not promoting to full repaint during cursor motion. Hypothesis (a) below is ruled out.
  • clearRegion cost is small. The 2-call frames spend ~35,500 ns total in clearRegion for 1152 pixels (~30 ns/px, ~17.8 us per call). The 6-call frames spend ~120,000 ns for 3456 pixels (the same per-pixel cost; the variation is in call count, not per-call cost). Hypothesis (b) below is mostly ruled out as a cause of perceived unsmoothness. A per-pixel optimization is real future work but would not move the needle on smoothness.
  • render_ns is the dominant per-frame cost. The render loop spends ~7,800,000 ns (~7.8 ms) per composite cycle to render a single surface. In the sampled frames that is roughly 70x the clearRegion cost of a 6-call frame and over 200x that of a 2-call frame, and it consumes a large fraction of any reasonable per-frame budget.
  • Inter-frame gap is ~115 ms (~8.7 Hz). Consecutive ad25_diagnostic events' ts_wall_ns deltas land at ~114-115 million ns. At 8.7 Hz the cursor visibly steps rather than glides, regardless of clearRegion cost.
  • clear_calls cycles between 2, 4, and 6 per cycle. Two rects per cursor pump tick (old and new); when multiple pump ticks queue between composites, the rects accumulate (4 = two ticks queued, 6 = three). Confirms the pump is firing faster than the compositor is consuming.
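The derived figures in these findings can be recomputed from the two sample lines. The sketch below only does the arithmetic; the struct and function names are hypothetical and the field names mirror the ad25_diagnostic event.

```c
#include <stdint.h>

/* Fields copied from one ad25_diagnostic event. */
struct ad25_sample {
	uint32_t clear_calls;
	uint64_t clear_px;
	uint64_t clear_ns;
	uint64_t render_ns;
};

double
clear_ns_per_px(const struct ad25_sample *s)
{
	return ((double)s->clear_ns / (double)s->clear_px);
}

double
clear_ns_per_call(const struct ad25_sample *s)
{
	return ((double)s->clear_ns / (double)s->clear_calls);
}

double
render_to_clear_ratio(const struct ad25_sample *s)
{
	return ((double)s->render_ns / (double)s->clear_ns);
}

/* Composite cadence implied by the gap between consecutive events. */
double
cadence_hz(uint64_t inter_frame_gap_ns)
{
	return (1e9 / (double)inter_frame_gap_ns);
}
```

For frame 234 this gives about 31 ns/px and about 17.9 us per clearRegion call; for frame 233 about 35 ns/px. A 115 ms inter-frame gap corresponds to about 8.7 Hz.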

Real bottleneck

Two issues compound, neither named in the original entry:

  • Composite cadence is ~8.7 Hz. Between consecutive composites, ~115 ms passes. At that cadence, the cursor visibly chunks. Whether this is the scheduler holding off composite, the cursor pump rate, or the render itself bounding the cadence, is not yet measured.
  • Per-composite render cost is ~7.8 ms for one surface. That single surface (a fullscreen semadraw-term) is being fully re-rendered every time the cursor moves a single pixel. The cursor's region damage propagates to the term as surface damage; the term re-renders its entire buffer; the compositor presents. 7.8 ms is ~47% of the 16.7 ms frame budget at 60 Hz; against the ~9 ms budget of the 110 Hz target that earlier frame_complete data showed, it is most of the frame. A per-character or per-region damage path inside semadraw-term would let small-region cursor motions skip the term's full re-render, which is probably the largest single win available.
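The frame-budget comparison is simple enough to check directly. A minimal sketch with hypothetical helper names:

```c
/* Per-frame time budget in nanoseconds at a given refresh rate. */
double
frame_budget_ns(double refresh_hz)
{
	return (1e9 / refresh_hz);
}

/* Fraction of one frame budget consumed by a cost in nanoseconds. */
double
budget_fraction(double cost_ns, double refresh_hz)
{
	return (cost_ns / frame_budget_ns(refresh_hz));
}
```

A 7.8 ms render uses about 47% of a 60 Hz frame and about 86% of a 110 Hz frame.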

These are both substantially larger pieces of work than "memcpy the clear loop." The original Small-Medium estimate is revised upward.

Open questions for the next diagnostic round

Before designing a fix, three measurements are still needed:

  • Cursor pump cadence in isolation. Instrument pumpCursorPosition() to log every invocation with timestamp and whether the position actually changed. If the pump fires at >=60 Hz but composite still runs at 9 Hz, the scheduler is the bottleneck. If the pump itself fires at 9 Hz, the inputfs side is the bottleneck and AD-25 is misdiagnosed (the input substrate is slow, not the compositor). The clear_calls cycling between 2 and 6 suggests the pump is faster than composite, but a direct measurement would confirm.
  • Scheduler cadence. What is FrameScheduler actually targeting and what does it return when asked? The 110 Hz target seen in earlier frame_complete events versus the 8.7 Hz observed during cursor motion is suspicious; is the scheduler holding off composite while damage is pending?
  • Render-cost breakdown. Where in the render path do the 7.8 ms go? Is it SDCS interpretation, blit to framebuffer, font rasterization in the term's render cycle, or something else? A per-phase timing inside the backend's render would tell us whether the semadraw-term render fix is in the term itself or in the backend.

Revised estimate

Medium-Large, not Small-Medium. Both real bottlenecks require design work:

  • Per-region damage inside semadraw-term so it can re-render only the cells the cursor's old/new rects intersect, not the whole buffer. Estimated own-ADR work, likely Medium.
  • Compositor scheduler review to understand the gap between target Hz and observed cadence. Possibly small if the cause is a single-line bug; possibly large if the scheduler model needs revision.

These two items may want their own BACKLOG entries (AD-25.1 and AD-25.2) once their scopes are clearer. AD-25 itself remains Open as the umbrella tracking item until the underlying work is enumerated and assigned.

Status note on the instrumentation patch

The instrumentation lives in semadraw/src/compositor/compositor.zig (the UTF_COMPOSITOR_INSTRUMENT env-var-gated path) and semadraw/src/daemon/events.zig (the emitAd25Diagnostic typed emitter). Zero runtime cost when the env var is unset. The instrumentation is intended to remain in place until the underlying issue is understood and fixed; remove only after AD-25 closes.


Original entry, preserved for historical context:

Surfaced during AD-21 sub-item 9 verification: with the region-damage fix in place, the cursor follows the pointer correctly but motion is described as "not smooth." Two candidate causes worth profiling before optimising:

(a) The fallback path firing more than expected. When a backend doesn't implement clearRegion, the compositor promotes to full repaint via markFullRepaint(), which clears the entire framebuffer and re-renders every visible surface. At 3840×2160, full repaint is expensive even for a 24×24 cursor move. drawfs and software backends both implement clearRegion, so this should never fire on the bench — but worth confirming with a log.debug count on the fallback branch.

(b) Per-pixel loops in clearRegionImpl. Both backends write pixels one at a time in a tight nested loop. For a 24×24 rect that's 576 pixel writes per region, two regions per cursor move (old + new), so up to 1152 pixel writes per move. At 60 Hz cursor motion that's ~70k pixel writes per second — should be negligible, but worth measuring; on slow EFI framebuffers the per-pixel loop may stall on write-combine flushes or similar.

Fix candidates (after profiling identifies the actual bottleneck):

  • Memcpy-based clear: precompute the row of background pixels once, memcpy that row across each scanline of the rect. ~3-5x speedup on x86-64 vs per-pixel loops.
  • Region damage rect coalescing: if the cursor moves twice within a composite cycle (rare but possible at >60Hz input rates), the pump emits four rects (two old, two new). Coalescing overlapping or adjacent rects into a single bounding rect would save clearRegion calls.
  • Surface-walk damage propagation: currently the pump walks all visible surfaces via getCompositionOrder on every cursor move. With many surfaces this is O(N×2); a spatial index would make it O(log N) but adds structure complexity.
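The first two candidates can be sketched together. This is an illustration under stated assumptions, not the clearRegionImpl of either backend: it assumes a 32-bit-per-pixel framebuffer with a stride given in pixels, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_rect {
	int x, y, w, h;
};

/*
 * Memcpy-based clear: fill the first scanline of the clipped rect once,
 * then copy that scanline onto each remaining row of the rect.
 */
void
sketch_clear_region(uint32_t *fb, int fb_w, int fb_h, int stride_px,
    struct sketch_rect r, uint32_t color)
{
	int x0 = r.x < 0 ? 0 : r.x;
	int y0 = r.y < 0 ? 0 : r.y;
	int x1 = r.x + r.w > fb_w ? fb_w : r.x + r.w;
	int y1 = r.y + r.h > fb_h ? fb_h : r.y + r.h;
	uint32_t *first;
	size_t row_px;

	if (x0 >= x1 || y0 >= y1)
		return;		/* nothing visible after clipping */

	first = fb + (size_t)y0 * (size_t)stride_px + (size_t)x0;
	row_px = (size_t)(x1 - x0);
	for (size_t i = 0; i < row_px; i++)
		first[i] = color;
	for (int y = y0 + 1; y < y1; y++)
		memcpy(fb + (size_t)y * (size_t)stride_px + (size_t)x0,
		    first, row_px * sizeof(uint32_t));
}

/* Coalescing: bounding rect of two damage rects. */
struct sketch_rect
sketch_rect_union(struct sketch_rect a, struct sketch_rect b)
{
	int x0 = a.x < b.x ? a.x : b.x;
	int y0 = a.y < b.y ? a.y : b.y;
	int x1 = a.x + a.w > b.x + b.w ? a.x + a.w : b.x + b.w;
	int y1 = a.y + a.h > b.y + b.h ? a.y + a.h : b.y + b.h;
	struct sketch_rect u = { x0, y0, x1 - x0, y1 - y0 };

	return (u);
}
```

A bounding-rect union trades extra cleared pixels for fewer clearRegion calls, so it pays off only when the rects overlap or sit close together, as the old and new cursor rects usually do.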

Estimate Small-Medium: profiling + one of the above is small; multiple of the above is medium. Defer until a user reports the smoothness as an actual problem; the current behaviour is correct.

[ ] AD-27 (pre-fix narrowed-scope diagnosis): trackpad pointer updates not reaching cursor surface (Was Reopened 2026-05-08 with narrowed scope after AD-30.1; Small-Medium)

Status update (2026-05-08, post-AD-30.1). AD-27 was filed 2026-05-07, superseded 2026-05-08 by AD-30 (because inputfs was attaching to zero HID devices, making the trackpad question unobservable), and now re-opens with a precisely localised scope after AD-30.1 restored inputfs's HID attachment.

The bench state after AD-30.1:

  • inputfs2 attached the HAILUCK touchpad at hidbus2 with roles=pointer,touch. Diagnostic dmesg lines:

    inputfs2: <HAILUCK CO.,LTD USB touchpad inputfs HID device> on hidbus2
    inputfs2: inputfs: descriptor 505 bytes, 48 input items, 0 output, 1319 feature, depth=2
    inputfs2: inputfs: pointer locations cached (x=yes y=yes wheel=yes buttons=1 count=6)
    inputfs2: inputfs: digitizer locations cached (report_id=7 tip=yes x=yes y=yes confidence=yes contact_id=yes scan_time=yes contact_count=yes button=yes x_range=[0..1535] y_range=[0..1023])
    inputfs2: inputfs: roles=pointer,touch
    inputfs2: inputfs: Device Mode set to MT Touchpad (report_id=11 rlen=2)
    

    All HUP_DIGITIZERS fields located. Device successfully flipped from Mouse Mode to Multi-touch Touchpad Mode (the AD-1 step 5 feature-report write succeeded; report_id=11 rlen=2).

  • Touch events are produced. inputdump events during trackpad-only motion shows a stream like:

    seq=542 ts=... dev=2 touch.type2
    seq=543 ts=... dev=2 touch.type2
    ...
    seq=557 ts=... dev=2 touch.type3
    

    touch.type2 is INPUTFS_TOUCH_MOVE (constant at inputfs.c:479); touch.type3 is INPUTFS_TOUCH_UP. The dispatcher at inputfs.c:2849-2902 is firing correctly: each Report ID 7 packet decodes into a touch event.

  • No pointer.motion events from dev=2. During the same trackpad-only motion window, no dev=2 pointer.motion lines appear on the event ring. By contrast, dev=0 (the ELECOM external mouse) produces pointer.motion x=N y=M dx=N dy=M buttons=0x0 session=... correctly — confirming the publish path is alive for HUG_MOUSE devices.

  • State region pointer slot unchanged. Two consecutive inputdump state | head -8 snapshots flanking a 5-second trackpad motion window show identical output: last_seq: 234 pointer: x=1036 y=989. The trackpad's TOUCH_MOVE events advanced the event-ring sequence but did not advance the state-region sequence (which only increments when the pointer slot is touched).

  • Cursor sprite does not move under trackpad. The direct visual: external mouse moves the cursor fine; trackpad does not.

Diagnosis

The touch dispatcher emits INPUTFS_TOUCH_DOWN, INPUTFS_TOUCH_MOVE, INPUTFS_TOUCH_UP to the event ring via inputfs_events_publish, with payloads containing the contact's pixel coordinates and session id. It does NOT also synthesize a pointer.motion event for the cursor-control case, and it does NOT update the state region's pointer slot. Single-finger touch — the canonical "touchpad as cursor" interaction — therefore reaches userland touch consumers (libsemainput's recogniser, future gesture tools) but not the cursor pump (semadrawd) which reads only the state region's pointer slot.

The fix is in the touch dispatcher at inputfs/sys/dev/inputfs/inputfs.c:2849-2902. The shape:

  1. Maintain an active_contact_count field on struct inputfs_softc (or compute it from the sc_touch_contacts[] array when needed).
  2. After updating sc_touch_contacts[cid] in any of the touch_down, touch_move, or touch_up branches, count active contacts.
  3. When active_contact_count == 1: this is the single-finger-touch case. Synthesize pointer motion: compute deltas from the previous primary contact's last_x/last_y to the current contact's px/py; call inputfs_state_update_pointer(dx, dy, synthetic_buttons, &actual_dx, &actual_dy) under the seqlock; emit a pointer.motion event with payload (new_x, new_y, actual_dx, actual_dy, synthetic_buttons, session). The synthetic buttons mask comes from the touchpad's button-pad state (the button=yes field in the digitizer locations).
  4. When active_contact_count == 0: contact lifted. No pointer motion to synthesise; emit touch_up only (current behaviour).
  5. When active_contact_count > 1: multi-finger gesture in progress. Suppress pointer motion; libsemainput's gesture recogniser owns these interactions at the userland layer. (Current behaviour is fine for this case — touch events keep flowing without polluting cursor state.)
  6. Transition handling. When count goes 1 → 2, a second finger has touched down; cursor should freeze in place rather than jump to the new finger's coordinates. When count goes 2 → 1, the user lifted one finger; cursor should resume tracking the remaining finger from its CURRENT position (no jump). The implementation uses a "primary contact" pointer that points at the first active contact; on transition, the primary contact is re-resolved before the next pointer.motion synthesis.
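
The contact-count rules above can be sketched as a small userland state machine. This is an illustration under stated assumptions, not the kernel change itself: struct touch_state, touch_apply, and MAX_CONTACTS are hypothetical names, the state-region update and event publish are left out, and the sketch simplifies the "primary contact" idea by dropping the primary whenever the count is not exactly 1 and re-anchoring when a sole contact is (re)resolved.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CONTACTS 5   /* hypothetical; the real limit comes from the descriptor */

struct contact { bool active; int x, y; };

struct touch_state {
    struct contact c[MAX_CONTACTS];
    int primary;              /* contact driving the cursor, -1 if none */
    int anchor_x, anchor_y;   /* primary's position at the last synthesis */
};

void touch_state_init(struct touch_state *st)
{
    for (int i = 0; i < MAX_CONTACTS; i++)
        st->c[i].active = false;
    st->primary = -1;
    st->anchor_x = st->anchor_y = 0;
}

/*
 * Apply one contact report (down=true for touch_down/touch_move, false for
 * touch_up). Returns true and fills dx/dy when a pointer.motion should be
 * synthesised; returns false when cursor motion must be suppressed.
 */
bool touch_apply(struct touch_state *st, int cid, bool down, int x, int y,
                 int *dx, int *dy)
{
    assert(cid >= 0 && cid < MAX_CONTACTS);
    st->c[cid].active = down;
    if (down) {
        st->c[cid].x = x;
        st->c[cid].y = y;
    }

    int count = 0, sole = -1;
    for (int i = 0; i < MAX_CONTACTS; i++) {
        if (st->c[i].active) {
            count++;
            sole = i;
        }
    }

    if (count != 1) {
        /* 0: finger lifted. >1: gesture in progress. Cursor stays frozen. */
        st->primary = -1;
        return false;
    }

    if (st->primary != sole) {
        /* New or re-resolved primary: anchor here so the cursor does not jump. */
        st->primary = sole;
        st->anchor_x = st->c[sole].x;
        st->anchor_y = st->c[sole].y;
        return false;
    }

    *dx = st->c[sole].x - st->anchor_x;
    *dy = st->c[sole].y - st->anchor_y;
    st->anchor_x = st->c[sole].x;
    st->anchor_y = st->c[sole].y;
    return *dx != 0 || *dy != 0;
}
```

In the real dispatcher the true branch is where inputfs_state_update_pointer and the pointer.motion publish would go; the touch events themselves are emitted regardless of the return value.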

Why this didn't bite earlier

ADR 0018 §3 documents the touchpad-mode flip and the HUP_DIGITIZERS parser thoroughly. Section 4 describes the per-contact event-emission model (touch_down / touch_move / touch_up). Section 4 does NOT cover "touchpad as cursor" — the cursor-motion synthesis path was implicitly assumed to live downstream (originally semainputd; later libsemainput). With semainputd retired (AD-2a) and libsemainput consumed only by semadrawd's gesture path (not its pointer pump), there is no surface left that synthesizes pointer motion from touch events, and the kernel needs to do it directly to keep the cursor responsive.

This is a design completion, not a bug regression — the kernel never had this code, and the userland consumer that used to do it is gone. The fix lives in the kernel because that's where the event-time information is freshest and the synthesis is cheapest.

Alternative considered: do the synthesis in semadrawd

Have semadrawd's cursor pump read both the state region's pointer slot AND the event ring's recent touch events, integrating touch_move deltas into a "derived pointer" that overrides the state region's pointer when present. The kernel would not need to change.

Rejected because:

  • Two paths to the cursor's authoritative position (state region for mouse; derived from event ring for touchpad) is more state to manage in the pump and harder to reason about.
  • Latency: the kernel writes the state region every HID interrupt; userland would have to drain the event ring and integrate deltas at compositor frame rate, adding ~16ms of cursor lag versus the kernel doing it in the interrupt handler.
  • Multiple state-region readers: inputdump state would misreport the cursor position whenever the cursor is touchpad-driven; userland tools that read state for cursor coordinates (current and future) would need a special-case path.
  • The state region is the documented authoritative source per ADR 0007; making the event ring co-authoritative weakens that guarantee.

The kernel-side synthesis is the right place. Touch consumers (libsemainput) keep getting their TOUCH_DOWN/MOVE/UP events unchanged; cursor consumers (semadrawd's pump) keep reading the state region as they always have. Both surfaces remain single-authoritative.

Closure criteria

  1. inputdump events during single-finger trackpad motion shows interleaved dev=2 touch.type2 and dev=2 pointer.motion events.
  2. inputdump state shows the pointer slot advancing during trackpad motion (last_seq and pointer x/y both updating).
  3. The cursor sprite follows single-finger touchpad motion on the framebuffer.
  4. Two-finger gestures (e.g., pinch, scroll) do NOT move the cursor; only touch.type* events fire.
  5. The mouse-driven cursor still works (no regression in the existing pointer.motion publish path).
  6. WITNESS-clean (no lock-order violations introduced by the touch-side state-region update).
  7. No drawfs counter regression (vmobj_install_lost, inbuf_grow_race_lost, frame_extract_race_lost remain 0; semadrawd uptime climbs steadily).

Estimate

Small-Medium. The change touches one function (inputfs.c:2849-2902 plus a new "primary contact" field on softc), adds a counter, and synthesizes a pointer.motion event using the existing inputfs_events_publish and inputfs_state_update_pointer helpers. No new data structures, no ADR amendment needed (the design is consistent with ADR 0018 §3 and ADR 0007). The test surface is the existing fuzz harness (HUP_DIGITIZERS descriptors) plus bench verification. ~150 lines of kernel C; ~50 lines of test scaffolding. One bench reboot for verification.


Pre-supersede entry below; preserved for the investigation history and the ad22-diagnose.sh diagnostic record. The 2026-05-07 framing is superseded by the 2026-05-08 narrowed-scope diagnosis above.

[ ] AD-32: semadrawd main loop busy-waits at ~67 kHz (Open, Medium; surfaced 2026-05-12)

Surfaced by AD-25 Round 1 instrumentation (ADR 0007 addendum, commit 6d670b9). The daemon's main run loop iterates at roughly 67,500 iterations per second on the bench (pgsd-bare-metal-test-machine, drawfs backend, single client connected via Unix socket, semadraw-term --fullscreen running). Each iteration runs pumpCursorPosition, pumpCursorFocus, comp.pollEvents, and the needsComposite + composite block, then re-enters posix.poll.

Observation

Pump-event cadence measured directly:

  • Average inter-event delta: 14.8 us (across 9,998 captured events spanning 148 ms wall time).
  • Minimum inter-event delta: ~4 us (back-to-back iterations doing no work).
  • The loop is not gated by the posix.poll(timeout=100) floor. /dev/draw (the drawfs backend's pollable fd) shows readable continuously while inputfs is delivering events, so poll returns immediately with n > 0 rather than timing out.

The 8.7 Hz composite cadence observed in earlier benches (2026-05-10) is not caused by the loop being slow. The loop is fast; needsComposite() returns false on the vast majority of iterations. That smoothness-relevant question is tracked under AD-25 Round 2, separately.

This entry is concerned with the loop's busy-spin in its own right, not its effect on cursor smoothness.

Implications

  • CPU. Continuous main-loop activity consumes CPU even when there is no productive work. On a desktop bench this is invisible against other load; on the sparrow laptop (1024x768 portable target, battery-powered) the impact is meaningful.
  • Power. A CPU that never reaches a deep idle state burns power proportional to the loop's per-iteration cost.
  • Cache. Each iteration reads the inputfs state mmap. If the state region's cache lines ping-pong between cores (inputfs writer on one core, semadrawd reader on another), the apparent cost of the loop may include cache-coherence traffic that does not show up as user-CPU time.

Open questions

  • What is each iteration doing? The 14.8 us average is larger than a "do nothing" loop should be on modern x86; something is consuming time per iteration. Is it the mmap read in pointerSnapshot, the getCompositionOrder walk in pumpCursorPosition's visibility check, the comp.pollEvents drain, or the posix.poll syscall itself returning immediately? A per-phase timing inside one loop iteration would settle this.
  • Why isn't posix.poll(timeout=100) reaching its timeout? If /dev/draw shows readable continuously, that means the drawfs event ring has pending events not being drained, or the ring's readable-condition isn't clearing after a drain. Either is a bug in drawfs or the backend's pollEvents consumer.
  • Is the cost of pumpCursorPosition itself significant? At 67 kHz the pump runs ~67,000 times per second even when nothing has changed. The change-detection fast-paths return early, but the cost of reaching them (mmap read, hotspot subtract, fb-dims fetch, two visibility-comparison branches) is not free.

Next steps

A discovery round similar to AD-25's, but targeting the loop itself rather than the cursor pipeline. Likely sequence:

  • D1 (instrumentation): add a loop-iteration counter and per-phase timing inside one iteration, gated on a new UTF_LOOP_INSTRUMENT env var. Sample every Nth iteration to keep log volume bounded.
  • D2 (data collection): bench-run with the new instrumentation under the same conditions that surfaced the 67 kHz finding.
  • D3 (analysis and ADR): based on findings, write an ADR for the fix. Candidate fix directions, none of which are committed to here:
    • Replace posix.poll(100) with a deadline-driven sleep using the existing FrameScheduler.getTimeUntilDeadline() plus an event-aware wake mechanism.
    • Add an eventfd-style wake from inputfs so the pump only runs when the cursor actually moved.
    • Investigate why /dev/draw is continuously readable (the underlying drawfs event ring may have a level-triggered readable bit when it should be edge-triggered, or the consumer may not be draining to empty).
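
A minimal sketch of what D1's sampled per-phase timing could look like, written in C for illustration although the daemon itself is Zig. The struct and function names (loop_probe, probe_begin, probe_mark) and the phase split are assumptions; only the UTF_LOOP_INSTRUMENT gate and the sample-every-Nth idea come from the plan above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Assumed phase split, following the loop description in this entry. */
enum phase { PH_POLL, PH_PUMP, PH_EVENTS, PH_COMPOSITE, PH_COUNT };

struct loop_probe {
    int      enabled;        /* set from UTF_LOOP_INSTRUMENT */
    int      sampling;       /* is the current iteration being timed? */
    uint64_t iter;           /* total iterations seen */
    uint64_t sample_every;   /* time one iteration in every N */
    uint64_t samples;        /* iterations actually timed */
    uint64_t total_ns[PH_COUNT];
    uint64_t t0;             /* timestamp of the previous mark */
};

uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

void probe_init(struct loop_probe *p, uint64_t sample_every)
{
    assert(sample_every > 0);
    const char *v = getenv("UTF_LOOP_INSTRUMENT");
    p->enabled = (v != NULL && v[0] == '1');
    p->sampling = 0;
    p->iter = 0;
    p->samples = 0;
    p->sample_every = sample_every;
    p->t0 = 0;
    for (int i = 0; i < PH_COUNT; i++)
        p->total_ns[i] = 0;
}

/* Call at the top of each iteration. Returns nonzero if this one is timed. */
int probe_begin(struct loop_probe *p)
{
    p->sampling = p->enabled && (p->iter % p->sample_every == 0);
    p->iter++;
    if (p->sampling) {
        p->samples++;
        p->t0 = now_ns();
    }
    return p->sampling;
}

/* Call after each phase; charges the time since the previous mark to ph. */
void probe_mark(struct loop_probe *p, enum phase ph)
{
    if (!p->sampling)
        return;
    uint64_t t = now_ns();
    p->total_ns[ph] += t - p->t0;
    p->t0 = t;
}
```

Dividing total_ns[ph] by samples gives the mean cost per phase. Unsampled iterations pay only a modulo and a branch, which matters at the iteration rate recorded above because the clock reads themselves are not free.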

Estimate

Medium. Discovery sequence is roughly two to four hours. The actual fix depends entirely on D3 findings; could be small (one-line drain fix in drawfs) or medium (loop-pacing redesign with a new wake mechanism). A dedicated ADR will follow D3.

Related

  • AD-25 (cursor motion smoothness): sibling track, surfaced concurrently from the same Round 1 bench. AD-25 owns the composite-gating question; AD-32 owns the loop-pacing question.
  • ADR 0007 addendum (2026-05-12): records the bench finding that surfaced this entry.

[ ] AD-34: pump's mmap view of inputfs state appears stale vs inputdump (Open, Medium; surfaced 2026-05-12)

Surfaced by AD-25 Round 2 (ADR 0007 second addendum, commit b345984). The cursor pump in semadrawd reads pointer state from /var/run/sema/input/state via mmap(MAP_SHARED, PROT_READ). The pump fires at ~67 kHz; inputfs publishes state updates at ≥130 Hz under continuous cursor motion (as observed independently via inputdump state --watch --interval-ms 50). The pump should therefore see a new position on a meaningful fraction of reads. It does not.

Observation

Bench cycle on pgsd-bare-metal-test-machine, single semadraw-term --fullscreen client, continuous cursor motion for ~10 seconds, both UTF_PUMP_INSTRUMENT=1 and UTF_COMPOSITE_GATE_INSTRUMENT=1 set. Captured window (73.6 ms, log-rotation-limited):

  • pump_diagnostic events: 9,998.
  • pos_changed:true events: 1 (0.01%).
  • state_valid:false: 0.

Same bench, inputdump state --watch --interval-ms 50 observes the pointer position changing every 50 ms poll across the full bench duration (~33 snapshots, all marked "changed", pointer coordinates spanning x ∈ [1600, 3000], y ∈ [800, 1600] across the 3840×2160 framebuffer).

The two consumers use the same StateReader.init path in shared/src/input.zig: same mmap flags, same seqlock discipline, same x/y offsets.

Implications

This is the only unresolved gate between AD-25's "cursor motion smoothness" symptom and its resolution. Round 2 ruled out all other candidates (clearRegion cost, full-repaint, poll-timeout pacing, FrameScheduler gating, damage propagation). If the pump saw position updates at inputfs's actual publication rate (~130-200 Hz), composite would fire at the same rate, and the smoothness complaint would dissolve.

Whatever this is, it is also blocking AD-25 from closing.

Open questions

  • What raw ps.x, ps.y values does the pump actually read across iterations? The current pump_diagnostic event reports the boolean result of change detection but not the underlying values. Adding ps.x, ps.y to the event payload would directly answer "is the pump seeing the same value repeatedly, or different values that compare-equal somehow?"
  • Is the mmap view consistent with the file content via read(2)? A test that mmap's the state file and read()'s it simultaneously, comparing the two views, would distinguish "mmap pages are stale" from "the file's bytes are themselves stale."
  • Does the pump need higher-precision pos_changed detection? Current logic compares f32 surface positions against cached f32 values. The conversion i32 → f32 → != is exact for integer pixel values within ±2^24, which the bench's 3840×2160 framebuffer is well inside. Not a likely cause, but listed for completeness.
  • Is the kernel's vn_rdwr write of the state file coherent with userspace mmap reads at this rate? The tmpfs vnode is the same VM object as the mmap; in principle writes should be visible immediately. If they are not (e.g., write goes through one cache layer while mmap reads go through another), this is the mechanism.
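
The mmap-versus-read(2) comparison from the second question could be sketched as below. This is an illustrative probe, not an existing tool: map_state and compare_views are hypothetical names, and the 8 KB buffer is an assumption sized to be larger than the state region.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Open path read-only and map its first len bytes PROT_READ / MAP_SHARED. */
const unsigned char *map_state(const char *path, size_t len, int *fd_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return p;
}

/*
 * Compare the mapped view against a fresh pread(2) of the same bytes.
 * Returns 0 if the first len bytes agree, 1 if they differ, -1 on error.
 */
int compare_views(int fd, const unsigned char *map, size_t len)
{
    unsigned char buf[8192];   /* assumed upper bound on the region size */
    assert(len <= sizeof buf);

    size_t got = 0;
    while (got < len) {
        ssize_t n = pread(fd, buf + got, len - got, (off_t)got);
        if (n < 0)
            return -1;
        if (n == 0)
            break;             /* file shorter than len */
        got += (size_t)n;
    }
    if (got != len)
        return -1;
    return memcmp(buf, map, len) != 0;
}
```

Run in a loop during cursor motion, a persistent 1 would point at stale mmap pages, while a steady 0 with unchanging bytes would point upstream at the writer. The two reads are not atomic with respect to the writer, so an isolated mismatch may just be a torn read.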

Findings (2026-05-12)

E1 instrumentation (committed this session) added raw ps.x, ps.y fields to the pump_diagnostic event. Bench cycle on pgsd-bare-metal-test-machine with continuous cursor motion for ~10 seconds, both UTF_PUMP_INSTRUMENT=1 and the E1 patch active.

Bench results:

  • Total pump_diagnostic events: 39,992.
  • Unique (ps_x, ps_y) tuples among state_valid=true events: 1 (the value (461, 273)).
  • First and last tuples: identical, (461, 273).
  • state_valid=false: 0.

The pump read the same value on all 39,992 events during a bench cycle where the cursor was visibly moving across the 3840×2160 framebuffer. The mmap view is fully stale: it freezes at whatever value was published at mmap-open time and never updates again.

Cross-process probes (same bench session):

  • sudo inputdump state --watch --interval-ms 50 (running as root): sees pointer position change every poll, covering positions across the full framebuffer (x ∈ [1960, 3337], y ∈ [106, 1641]). Live state visible.
  • sudo -u _semadraw inputdump state --watch --interval-ms 50 (running as the _semadraw system user, uid 1002, the same uid semadrawd runs as after privilege drop): sees only the initial === snapshot === block. No === changed === blocks fire during ~10 seconds of cursor motion. Same staleness symptom as the pump.
  • sudo -u _semadraw xxd /var/run/sema/input/state taken twice with a cursor move in between: bytes differ at the pointer x/y offsets and the last_seq offset. _semadraw can read fresh data via read(2) but not via mmap(MAP_SHARED, PROT_READ).

Root cause localised: the staleness is specific to mmap(MAP_SHARED, PROT_READ) opened by a non-root group member against a tmpfs file that the kernel is writing via vn_rdwr(IO_SYNC). Root mmaps of the same file work correctly. The same _semadraw process gets fresh content via read(2) on the same fd; only mmap is affected.

Ruled out by these findings:

  • Pump-side bugs (f32 conversion, hotspot subtraction): the 1 unique tuple result means there's nothing to convert or subtract; raw integer bytes are identical across reads.
  • Inputfs not publishing: it is publishing, demonstrably, to anyone reading via read(2).
  • Generic mmap staleness: it works for root.

Standing question: why does FreeBSD 15 tmpfs+vn_rdwr not invalidate mmap pages held by a non-root credential when it does for root? This is a kernel-side question for which the answer requires reading tmpfs/vm internals. The specific issue is recorded in docs/FREEBSD_ISSUES.md ("Issue #1: tmpfs mmap staleness for non-root group members") for future reference.

Implication for AD-25: the "cursor motion smoothness" symptom is fully explained. The pump cannot see fresh positions; composite cannot fire on damage that wasn't marked; damage cannot be marked because the change-detection sees no change. The fix direction is now a design question, not a measurement question; see AD-35.

Follow-up finding (2026-05-12, during AD-35 D1)

While drafting ADR 0008 to evaluate the three fix directions, a probe of the event-ring file as _semadraw was run as a sanity check (sudo -u _semadraw inputdump events --watch --interval-ms 50). It returned live pointer.motion events streaming at ~110 events/sec, with absolute x/y coordinates updating across the framebuffer. The event ring uses the same mmap(MAP_SHARED, PROT_READ) primitive against a tmpfs file written by the same kernel kthread via vn_rdwr(IO_SYNC) -- structurally identical to the state region's broken read path -- yet works correctly for _semadraw's mmap.

This narrows the AD-34 localisation. The bug is not the generic "tmpfs + mmap + non-root credential" combination as originally characterised; it is that combination plus the access pattern of the state-region writes. Two candidate distinguishing factors:

  • The state region's writes always overwrite the same pages (whole-buffer rewrites of a ~5 KB region) on every sync. The event ring writes per-slot at varying offsets across a ~65 KB file, hitting different pages over time.
  • The state region is small enough (~5 KB) that a single write touches a small number of pages repeatedly. The ring is large enough that the bulk of writes land on pages that have not been recently touched.

The current best guess (not verified) is that vm_object_page_clean or its equivalent is effective only on pages freshly dirtied since the previous sync, and the state-region's repeated rewrites of the same pages somehow fail to mark those pages dirty in a way that invalidates non-root mmaps.

AD-34 stays open as a kernel-side investigation track; the narrower characterisation makes a future fix more tractable. The UTF-side workaround chosen in ADR 0008 (Direction 2, event-ring consumption) avoids the broken access pattern entirely and does not depend on the kernel-side fix.

Next steps

A focused discovery round, narrower than AD-25's Round 2. Likely sequence:

  • E1 (instrumentation): extend the pump_diagnostic event with raw ps.x, ps.y values. Bench-collect. Determine whether the pump is reading stale identical values or reading varying values that the comparison misses. Instrumentation landed 2026-05-12 (this commit); bench collection pending. The pump_diagnostic event payload now includes ps_x: i32 and ps_y: i32 carrying the raw mmap-read coordinates. Activation is the existing UTF_PUMP_INSTRUMENT=1 env var; no new gate. Findings recorded 2026-05-12: 1 unique tuple in 39,992 events; mmap is fully stale. Cross-process probes further localised the issue to non-root mmap of tmpfs files written by vn_rdwr. See Findings (2026-05-12) above.
  • E2 (cross-tool test): write a small standalone tool that mmap's the state file and read(2)'s the same file simultaneously, comparing the two views. Run it during cursor motion. Verifies whether the mmap path is the staleness source or something upstream. Resolved 2026-05-12 by an xxd probe without needing a dedicated tool. sudo -u _semadraw xxd of /var/run/sema/input/state taken twice with a cursor move in between showed differing bytes at the pointer offsets, while the same user's inputdump --watch saw stale data. read(2) works for _semadraw; mmap does not. The dedicated tool is no longer needed.
  • E3 (analysis and ADR if needed): based on E1 and E2 findings, decide whether the fix is in:
    • semadrawd's mmap usage (e.g., msync(2) hints, re-mmap on detected staleness)
    • inputfs's write path (e.g., vm_object_page_clean after vn_rdwr)
    • the architectural choice itself (consume events from the event ring instead of polling the state mmap, which would also benefit AD-32's busy-spin concern)
    Promoted 2026-05-12 to its own BACKLOG entry: see AD-35. The decision is substantial enough to warrant a dedicated tracker; E1 and E2 findings give the foundation it needs.

The E3 decision likely warrants a dedicated ADR (0008+) if the fix is structural rather than tactical.

Estimate

Medium. E1 is small (mirrors Round 1's instrumentation shape). E2 is small (a focused diagnostic tool, ~100 lines of Zig). E3 depends on findings: could be a one-line msync addition, could be a substantial event-ring consumer rewrite. A dedicated ADR may be needed depending on which direction E3 points.

Related

  • AD-25: the umbrella tracker for cursor smoothness. AD-25 cannot close until AD-34's question is resolved.
  • ADR 0007 second addendum (2026-05-12): records the bench finding that surfaced this entry.
  • AD-32: the loop busy-spin concern. If E3 selects the "consume events from the ring" direction, AD-32's fix may fall out naturally (the ring is pollable; the loop could sleep instead of spin).

[ ] AD-36: replace pumpCursorPosition's state-region poll with event consumption (Open, Small; surfaced 2026-05-12; bench-blocked on AD-43 as of 2026-05-27)

ADR 0008's Direction 2 implementation. The cursor pump currently reads pointer position via the state-region mmap (StateReader.pointerSnapshot) on every loop iteration; AD-34 established that this mmap is frozen for _semadraw. The inputfs event ring, by contrast, reads correctly as _semadraw (Probe 5 in docs/FREEBSD_ISSUES.md) and already flows through the compositor's input backend.

Observation

The work is smaller than ADR 0008 first framed it. The event-ring consumption path is already wired into semadrawd:

  • semadraw/src/backend/inputfs_input.zig InputfsInput drains the ring inside the drawfs backend's pollEventsImpl.
  • semadraw/src/backend/drawfs.zig getInputfsEventsImpl (line 1319) snapshots a raw input.Event slice of everything drained since the previous call. Added by AD-2a Phase 2.4.2 specifically so future consumers can read the unfiltered stream.
  • semadraw/src/backend/backend.zig exposes this via Backend.getInputfsEvents() (line 320).
  • semadraw/src/daemon/semadrawd.zig (line 1086) already calls self.comp.getInputfsEvents() on every loop iteration to feed the gesture recogniser.

The pump can read from the slice that is already in semadrawd's hand. No new EventRingReader, no new fd, no new poll, no new bootstrap concern.

Scope of change

  • pumpCursorPosition (semadrawd.zig:645) loses its state_reader dependency for the position-tracking purpose. The function gains a new input: the most recent pointer.motion event seen this iteration, if any. Source: the slice from self.comp.getInputfsEvents().
  • The pump's main-loop site already has the slice (the gesture-recogniser call site at line 1086). Either: (a) call the pump after the slice is in hand and pass it through, or (b) move the scan-for-latest-motion inside pumpCursorPosition after the slice is available. (a) is structurally cleaner; (b) is a smaller diff. Implementation chooses based on what the code looks like at the time.
  • The StateReader field on Daemon is not deleted (other state-region consumers may remain, e.g., focus-tracking and smoothing-parameter reads). It just stops being used for cursor position. The pump's state_reader == null lazy-open path becomes unused for this caller and can be left in place or removed based on whether anything else still depends on it.
  • Bootstrap: if no pointer.motion event has been seen since the daemon started, the pump returns without updating position, exactly as it does today when pointerSnapshot() returns null. The first cursor motion seeds last_cursor_pos_*. No special initial drain is needed; the existing InputfsInput.init() code already skips to current writer_seq at startup so the pump does not get a historical replay.
  • Overrun: InputfsInput.drainOnce already handles overrun and resets last_consumed. The pump sees a gap (events lost) but the next motion event seeds correctly; the visible effect is a single cursor jump rather than a smooth tween over the lost frames. Acceptable; matches the behaviour of any input system under sustained drop.
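
The harvest described above (scan this iteration's slice, keep the newest pointer.motion, do nothing when none was seen) can be sketched as follows. It is written in C for illustration although the daemon is Zig, and struct ev, enum ev_kind, and harvest_motion are hypothetical stand-ins for the real input.Event type and Daemon fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event shape; the real input.Event layout is not shown here. */
enum ev_kind { EV_POINTER_MOTION, EV_TOUCH, EV_KEY };

struct ev {
    enum ev_kind kind;
    int32_t x, y;        /* absolute coordinates for pointer.motion */
};

/* Stand-in for the last_motion_seen / last_motion_x / last_motion_y fields. */
struct cursor_track {
    bool    seen;
    int32_t x, y;
};

/*
 * Harvest the newest pointer.motion from this iteration's slice.
 * Returns true if the tracked position changed, false if the slice held no
 * motion event (bootstrap or idle) or repeated the current position.
 */
bool harvest_motion(struct cursor_track *t, const struct ev *evs, size_t n)
{
    assert(evs != NULL || n == 0);

    bool found = false;
    int32_t x = 0, y = 0;
    for (size_t i = 0; i < n; i++) {
        if (evs[i].kind == EV_POINTER_MOTION) {
            x = evs[i].x;      /* later events overwrite earlier ones */
            y = evs[i].y;
            found = true;
        }
    }
    if (!found)
        return false;

    bool changed = !t->seen || t->x != x || t->y != y;
    t->seen = true;
    t->x = x;
    t->y = y;
    return changed;
}
```

Because the events carry absolute coordinates, an overrun needs no special handling in this shape: the next motion event after the gap simply replaces the tracked position, which is the single cursor jump described above.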

Closure criterion

Bench: semadraw-term --fullscreen running, cursor moves across the screen, visual update is smooth and tracks the pointer continuously. The pump_diagnostic event (AD-38 below) shows non-stale ps_x, ps_y values. AD-25 closes contingent on this.

Implementation landed 2026-05-12 (this commit); bench verification pending. semadrawd.zig adds three Daemon fields (last_motion_x, last_motion_y, last_motion_seen); the main-loop inputfs_events scan harvests pointer.motion x/y into them inside the existing loop that feeds the gesture recogniser; pumpCursorPosition reads from those fields instead of opening a StateReader. The state_reader field is left in place but no longer assigned (its docstring records why). Resolves AD-25 in principle; bench confirms.

Risks

  • Coupling to the gesture-recogniser call site. If the gesture recogniser's call to getInputfsEvents() drains the slice before the pump runs, the pump sees nothing. Mitigation: order the pump call before the gesture recogniser, or have both consume a single snapshot taken once per iteration. The current code structure has only one call site for getInputfsEvents so this is straightforward, but the ordering needs to be made explicit.
  • Other state-region consumers stay on the broken path. AD-34's bug affects any state-region mmap by _semadraw. Today the pump is the only frequent consumer; focus tracking and smoothing parameters read infrequently and have not been observed to misbehave. If a new bug surfaces in any of those consumers, AD-34's underlying cause should be the first hypothesis.

Related

  • ADR 0008: the design decision that schedules this work.
  • AD-25: cursor smoothness umbrella; closes when this lands.
  • AD-34: the underlying FreeBSD-side bug; stays open as a kernel-investigation track.
  • AD-38: instrumentation refresh; lands with or near this.

Update 2026-05-27

AD-36's code change (replace state-region pump with event-ring harvest) is already in place and has been for some time. The closure criterion ("state_valid:true pump_diagnostic events under cursor motion") could not be demonstrated on bench because of three compounding issues, peeled apart through the day:

  • AD-41 (no wake source for inputfs events): closed 2026-05-27 morning. semadrawd's main loop now wakes on /dev/inputfs_notify instead of waiting for the 100 ms timeout.
  • AD-43.1 (per-pixel fillRect dominates CPU): closed 2026-05-27 afternoon. fillRect now uses row-major @memset over [*]u32 for the common no-blend cases. lldb sampling confirms the per-pixel writePixel loop is no longer hot.
  • AD-43.3b (full-screen 4K blit per composite cycle): active as of 2026-05-27 evening. Even with AD-43.1's fast path on the inner fill, each composite still copies a full 3840 x 2160 RGBA surface (33 MB) via the BLIT_TO_EFIFB ioctl. At ~45 fps that costs most of one CPU. Subrect blit (passing the damage rect's bounding box instead of the full surface) reduces this cost proportionally.
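
The blit-cost figures in the AD-43.3b bullet follow from simple arithmetic, sketched here with hypothetical helper names. The 4 bytes per pixel matches the RGBA surface described above; any subrect size plugged in is illustrative rather than measured.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes moved by one blit of a w x h region at 4 bytes per pixel (RGBA). */
uint64_t blit_bytes(uint32_t w, uint32_t h)
{
    return (uint64_t)w * h * 4u;
}

/* Sustained copy bandwidth in bytes per second at a given composite rate. */
uint64_t blit_bandwidth(uint32_t w, uint32_t h, uint32_t fps)
{
    return blit_bytes(w, h) * fps;
}

/* How many times fewer bytes a subrect blit moves than a full-surface blit. */
uint64_t blit_saving_factor(uint32_t full_w, uint32_t full_h,
                            uint32_t sub_w, uint32_t sub_h)
{
    assert(sub_w > 0 && sub_h > 0);
    assert(sub_w <= full_w && sub_h <= full_h);
    return blit_bytes(full_w, full_h) / blit_bytes(sub_w, sub_h);
}
```

blit_bytes(3840, 2160) is 33,177,600 bytes, the 33 MB quoted above, and at 45 fps that is roughly 1.5 GB/s of copy traffic. For a damage bounding box of, say, 48 x 48 pixels (an illustrative size for a small cursor move), the subrect blit moves 3,600 times fewer bytes per composite.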

AD-43.3a (compose-gate audit) was originally filed as a parallel blocker on the hypothesis that damage_tracker/scheduler was spinning unnecessarily. The gate instrumentation that landed in commit 3991931 showed the hypothesis was wrong about the current system; AD-43.3a is now open but blocked on reproduction. See AD-43's 2026-05-27 evening update for the detailed reasoning.

The AD-36 closure criterion remains the same; the bench verification is now contingent on AD-43.3b. AD-36 stays Open as a documentation-state matter; its underlying code-state work is complete.

[ ] AD-37: investigate why /dev/draw stays continuously readable in poll (Open, Small-Medium; surfaced 2026-05-12)

ADR 0008 claimed Direction 2 would close AD-32 (semadrawd main loop busy-spin) "as a side effect" by sleeping on /dev/draw. Pre-implementation reading of the main loop shows the claim was optimistic: /dev/draw is already in the poll set (semadrawd.zig:1029, backend.zig getPollFd), and posix.poll is already called with that fd as a wake source. The 67 kHz busy-spin observed under AD-25 Round 1 happened anyway, because the fd shows readable continuously while inputfs is active (per the AD-32 entry's Observation section).

The work to actually close AD-32 is therefore an investigation, not an implementation. Why does the fd stay readable, and what would make poll block when no new event has arrived?

Possible causes (to investigate, not yet ruled out)

  • Level-triggered semantics. /dev/draw may signal readable as long as any unconsumed data exists on the backend. If pollEvents() does not drain to the bottom of the kernel-side queue, residual data keeps the fd readable forever. Fix: drain to empty per iteration, or use edge-triggered semantics if available.
  • Spurious wakeups. The drawfs character-device driver may signal readable on internal events not relevant to userland (e.g., bookkeeping in the kernel driver). Fix: filter the readable signal to only fire on user-visible event arrivals.
  • Multiple producers. inputfs publishes to /dev/draw via DRAWFSGIOC_INJECT_INPUT (legacy path) and via the inputfs-event drain at the same time. If both keep the queue non-empty in alternation, the fd never quiesces. Fix: identify which producer is responsible and gate it.
  • Polling at the wrong layer. AD-32 may be best fixed by sleeping on the ring's writer_seq advancing rather than on /dev/draw at all. The ring is mmap'd; sleeping on a memory location requires umtx or a futex primitive; not obviously cheaper than poll. This is the architecturally most invasive option.
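
The first hypothesis (level-triggered readiness plus an incomplete drain) is easy to reproduce on any pollable fd. The sketch below uses a plain pipe as a stand-in; whether /dev/draw actually behaves this way is exactly what this entry has to establish, and is_readable and drain_to_empty are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Nonzero if fd is readable right now (zero-timeout poll). */
int is_readable(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };
    int n = poll(&p, 1, 0);
    return n > 0 && (p.revents & POLLIN) != 0;
}

/*
 * Read until the fd reports EAGAIN, i.e. its queue is empty.
 * fd must be O_NONBLOCK. Returns total bytes drained, or -1 on error.
 */
long drain_to_empty(int fd)
{
    assert(fd >= 0);
    char buf[256];
    long total = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;                 /* EOF: writer closed */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;                 /* queue is empty */
        return -1;
    }
}
```

With level-triggered readiness, a single partial read leaves the fd readable, so the next poll returns immediately; only after drain_to_empty does poll block again. If E1's logging shows pollEvents leaving residual data behind, this is the one-line-fix outcome.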

Scope of investigation

Discovery-driven, similar to AD-25's earlier rounds. Likely sequence:

  • E1 (instrumentation): log the revents bitmask returned by posix.poll, the count of events drained by pollEvents, and getInputfsEvents().len for each main-loop iteration. UTF_LOOP_INSTRUMENT=1 gate.
  • E2 (bench): run for ~10 seconds with cursor motion, then with no input. Compare which fd is keeping the loop hot.
  • Outcome dictates whether the fix is a one-line drain correction, a kernel-side filter in drawfs, or a more substantial wake-source rework.

Closure criterion

The main loop sleeps in posix.poll when no input is arriving and wakes within one inputfs sync interval (1 ms at INPUTFS_SYNC_HZ=1000) when input arrives. AD-32 closes when this is bench-verified.

Risks

  • AD-36 lands without AD-37. This is fine. AD-36 fixes cursor smoothness; AD-37 fixes CPU/power on idle. They are orthogonal concerns the ADR mistakenly coupled. AD-37 can proceed at its own priority.

Related

  • AD-32: the busy-spin entry that this investigation resolves.
  • ADR 0008: the source of the original (optimistic) "side effect" framing; this entry corrects it.
  • AD-36: the cursor pump fix; independent.

[ ] AD-38: refresh pump and composite-gate instrumentation for the new code path (Open, Small; surfaced 2026-05-12)

The AD-25 Round 1 pump_diagnostic event and the AD-34 E1 extension (ps_x, ps_y) emit from inside pumpCursorPosition at the point where the state-region mmap is read. AD-36 removes that read; the diagnostic emit site must follow the new shape.

Scope of change

  • The pump_diagnostic event payload schema stays the same: surface_id, deltas, last_cursor_pos_*, ps_x, ps_y. After AD-36, ps_x and ps_y carry the absolute coordinates from the latest consumed pointer.motion event payload (rather than from a state-region mmap read).
  • The emit gate stays UTF_PUMP_INSTRUMENT=1. No new env var.
  • The emit site moves from the state-region read in the old pump body to the post-event-scan path in the new pump body. If no pointer.motion event was seen this iteration, no pump_diagnostic event is emitted (no position change means nothing to diagnose); this matches the existing behaviour in the pointerSnapshot()-returned-null branch.
  • composite_gate_diagnostic (AD-25 Round 2) does not change. It fires from needsComposite, which is unaffected by AD-36.
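The emit rule described above (take ps_x, ps_y from the latest consumed pointer.motion event, and emit nothing when no such event was seen) can be sketched as follows. This is an illustrative C sketch only: the real pump is Zig, and the struct ev layout, the enum values, and the name pump_scan are hypothetical, since the inputfs event format is not shown in this entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event record; stands in for one consumed inputfs event. */
enum ev_kind { EV_KEY = 0, EV_POINTER_MOTION = 1 };
struct ev { enum ev_kind kind; int32_t x; int32_t y; };

/* The two fields AD-34 E1 added to the pump_diagnostic payload. */
struct pump_sample { int32_t ps_x; int32_t ps_y; };

/* Scan one main-loop iteration's events.
 * Returns 1 and fills *out from the LAST pointer.motion event seen,
 * or returns 0 (caller emits no pump_diagnostic) if there was none. */
int pump_scan(const struct ev *evs, size_t n, struct pump_sample *out)
{
    int seen = 0;

    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (evs[i].kind == EV_POINTER_MOTION) {
            out->ps_x = evs[i].x;   /* absolute coordinates from the payload */
            out->ps_y = evs[i].y;
            seen = 1;
        }
    }
    return seen;
}
```

The caller would still check the UTF_PUMP_INSTRUMENT=1 gate before emitting; the sketch covers only the "did anything change this iteration" decision.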

Closure criterion

UTF_PUMP_INSTRUMENT=1 semadrawd produces pump_diagnostic events that show non-stale ps_x, ps_y under cursor motion, and zero pump_diagnostic emissions during periods of no input. Bench verification overlaps with AD-36's closure bench; one capture validates both.

Related

  • AD-36: the pump refactor; this entry refreshes its instrumentation.
  • AD-25 Round 1: the original pump_diagnostic schema.
  • AD-34 E1: added ps_x, ps_y to the schema.

[ ] AD-40: semadraw-term does not reconnect when semadrawd restarts (Open, Medium; surfaced 2026-05-17; scoping question on whether required)

Tracks: semadraw/src/apps/term/ (no reconnect path exists there as of 2026-05-17), install.sh (the trigger that surfaced it).

Symptom and how it surfaced. Running sudo sh install.sh from inside a semadraw-term session fails every attempt. Diagnosed (see the Phase 2.5 runbook SSH-hazard note, commit caa0e2c and its predecessor): install.sh's pre-install teardown stops semadrawd (stop_service_if_running semadraw semadrawd, SIGKILL fallback after 5s) and also stops utf-supervisor. semadraw-term is a client of semadrawd; when semadrawd dies the compositor is gone and the semadraw-term session collapses, taking the shell running install.sh with it, before the post-install restart block runs. The immediate operational fix is "run install.sh over SSH, not from semadraw-term" and is documented in the runbook. This entry is the underlying robustness gap that fix works around.

The actual gap. semadraw/src/apps/term/ has no reconnect / re-establish logic. If semadrawd disappears and later comes back (install restart, supervised respawn after a crash, manual service semadraw restart), an existing semadraw-term process does not reattach - the session is dead even though a healthy semadrawd is running again. The operator must start a fresh semadraw-term.

What this is NOT (precision against existing work). This is distinct from and not a duplicate of the D-5-area disconnect-robustness work (BACKLOG ~310-365, the 2026-04-23 use-after-free / double-disconnect-race fixes). That work hardened the daemon to survive a client dying: kill semadraw-term, semadrawd stays alive and clean (verified, segfault count flat across pkill -KILL -x semadraw-term runs). AD-40 is the opposite direction: the client surviving the daemon restarting. The existing verification explicitly covers "daemon survives client death"; nothing covers "client recovers from daemon death". The directions are independent; closing one did not address the other.

Scoping question (deliberately left open, not decided here). Whether semadraw-term should reconnect is a genuine design decision, not an obvious yes:

  • For: makes the terminal robust to any semadrawd restart (supervised respawn, upgrade, manual restart), not just the install case; aligns with s6 already auto-respawning semadrawd by default (see the AD-20 finish-script flap policy - a single restart is well under threshold).
  • Against / unresolved: even with client reconnect, install.sh SIGKILLs semadrawd and there is a hard window where the compositor is simply absent; a terminal's on-screen state and scrollback cannot be reconstructed from a new compositor connection without the daemon having persisted per-surface state, which it does not. Reconnect might restore a live connection but not the session's visible contents - partial recovery only.
  • Therefore the requirement is unclear: full session survival across semadrawd restart is a much larger piece (daemon-side surface-state persistence) than connection reconnect alone. This entry records the gap and the question; it does not commit to a design. A decision on scope (connection-reconnect-only vs full-session-persistence vs accept-and-document-the-limitation) should precede any implementation.

Not blocking. Does not block AD-2 (the runbook SSH-over-semadraw-term guidance is the working mitigation) or any current critical-path item. Robustness/quality work for the NDE/semadraw-term area; priority unset pending the scoping decision.

[ ] AD-43: software composite hogs semadrawd main loop on 4K bench (Open, Medium-Large; surfaced 2026-05-27 during AD-41.4; P1, transitively blocks AD-36 and AD-25 closure; AD-43.1 closed 2026-05-27, AD-43.3b now active, AD-43.3a blocked on reproduction)

Surfaced during AD-41.4 bench verification. After AD-41.3 landed (wake source via /dev/inputfs_notify), the bench re-run still showed zero pump_diagnostic events during a 10-second cursor-motion window. The notify fd was open, in the poll set, and would have been wakeable - but the main loop never returned to poll().

Observation

procstat -k -w 1 sampled semadrawd over 10 seconds: all 10 samples showed the thread in <running> state, never parked in kern_poll/seltdwait. The sibling processes (s6-supervise, s6-log) were parked in seltdwait every sample. ps -o pcpu,time confirmed 99.1% CPU sustained, with TIME advancing 1:1 with wall clock (5 seconds of CPU per 5 seconds of wall).

lldb -p $(pgrep -x semadrawd) -o "thread backtrace" was sampled three times, 2 seconds apart; all three samples landed in the same userspace stack:

frame #0: drawfs.DrawfsBackend.fillRect at drawfs.zig:1134
frame #1: drawfs.DrawfsBackend.executeChunkCommands
frame #2: drawfs.DrawfsBackend.executeSdcs at drawfs.zig:856
frame #3: drawfs.DrawfsBackend.renderImpl at drawfs.zig:818
frame #4: backend.Backend.render at backend.zig:273
frame #5: compositor.Compositor.composite at compositor.zig:450
frame #6: semadrawd.Daemon.run at semadrawd.zig:1267

One sample caught the innermost pixel write (drawfs.writePixel) with idx=27957156, partway through the framebuffer:

frame #0: drawfs.writePixel(idx=27957156, r=0, g=0, b=0, a=255, blend_mode=0) at drawfs.zig:1491

The framebuffer dimensions are confirmed by dmesg:

info(drawfs_backend): display 1: 3840x2160@60000mHz
info(drawfs_backend): efifb available: 3840x2160 stride=15360 bpp=32

3840 x 2160 x 4 bytes = 33,177,600 bytes per full frame. idx 27,957,156 falls in row 1820 of 2160 (27957156 / 15360 ~= 1820.1), so fillRect is roughly 84% of the way through filling the screen with opaque black, one 4-byte pixel at a time, in a Zig software loop.
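The arithmetic can be checked mechanically. The small C sketch below (hypothetical helper names fb_locate and fb_frame_bytes) decomposes a byte index into row and column using the stride and pixel size reported by dmesg, assuming the stride carries no per-row padding, which holds here since 3840 x 4 = 15360.

```c
#include <assert.h>
#include <stddef.h>

struct fb_pos { size_t row; size_t col; };

/* Split a framebuffer byte index into (row, column-in-pixels). */
struct fb_pos fb_locate(size_t idx, size_t stride, size_t bytes_per_pixel)
{
    assert(stride > 0 && bytes_per_pixel > 0);
    struct fb_pos p = { idx / stride, (idx % stride) / bytes_per_pixel };
    return p;
}

/* Bytes in one full frame. */
size_t fb_frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}
```

For the sampled values, fb_locate(27957156, 15360, 4) gives row 1820, column 489, and fb_frame_bytes(3840, 2160, 4) gives 33,177,600.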

The implementation at drawfs.zig:1131-1136:

while (px < x1) : (px += 1) {
    const idx = @as(usize, @intCast(py)) * stride
              + @as(usize, @intCast(px)) * 4;
    if (idx + 3 < fb.len) {
        writePixel(fb, idx, cr, cg, cb, ca,
                   self.render_state.blend_mode);
    }
}

The bounds check (idx + 3 < fb.len) runs once per pixel, the writePixel call may dispatch on blend_mode at each call, and there is no vectorisation. Estimated cost: ~5 seconds wall-clock per full-screen fillRect at 4K on the bench's single-core software path.

composite() likely calls fillRect at least once per frame for the background, plus once per surface for fills. With each frame costing seconds, the main loop iterates at ~0.2 Hz under load.

Why this matters

The cursor pump (AD-36) cannot demonstrate its closure criterion (state_valid:true pump_diagnostic events under motion) until the main loop iterates fast enough for the 10-second bench window to observe at least one iteration where pump runs after harvest has seen a motion event. With composite at ~5 seconds per frame, iteration rate is ~0.2 Hz; bench window of 10 seconds sees 2-3 iterations at best. The harvest may not happen to coincide with a motion event in any of them. AD-36's closure is therefore bench-blocked on this.

Transitively, AD-25 (cursor smoothness umbrella) is contingent on AD-36 and so is also blocked.

Beyond the AD-36 chain: at ~0.2 Hz iteration rate, all of semadrawd's responsiveness suffers - keyboard event forwarding latency, click handling, surface damage processing, anything that the main loop owns. The practical user-facing effect on bench is "the compositor appears completely unresponsive during sustained motion". This isn't an AD-36-specific problem; it's a general bench-usability problem on 4K hardware.

Fix paths

Three options, roughly in increasing scope, the way AD-42's were laid out:

(a) Vectorise fillRect via @memset or @as([*]u32, ...)[i] = packed_argb row-fills. When the blend_mode is NONE (opaque source), every row of a fillRect is the same run of identical 32-bit ARGB pixels: one word per pixel of rect width, which is stride/4 words for a full-width fill. @memset on a [*]u32 slice is one instruction per write-buffer cache line on amd64 (rep stos or vectorised). Expected speedup: 30-100x. Does not address the blended case, which still needs per-pixel work, but the bench's "fillRect to opaque black" path is the no-blend case and would benefit most.

(b) Add dirty-rect tracking to the composite path. Recompose only the area that actually changed since the previous frame. Most frames have small damage (cursor moved, single character typed) and would touch ~0.1% of the screen. Larger change; requires Composite to track previous-frame damage state. Independent of (a) - they compound.

(c) Move composite to a GPU path. drawfs has the EFI framebuffer mapping; a Vulkan or OpenGL backend would render to that buffer at hardware speeds. Largest scope; multi-week. The pre-existing UTF/BACKLOG discussion of Vulkan backend lives separately; this entry does not duplicate it.

(c) is the durable answer but is multi-week work and the wrong granularity for this entry. (a) is a half-day fix that would unblock AD-36 and the rest of the AD-25 chain. (b) is the right next step after (a), since (a) alone still re-renders the whole screen every frame.
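To make option (a) concrete, here is a C analogue of the row-fill idea. It is a sketch under stated assumptions, not the Zig implementation: the names pack_pixel and fill_rect_rows are hypothetical, pixels are assumed to be 4 bytes, and the stride is assumed to be a multiple of 4. The two points it illustrates are that the bounds check is hoisted out of the per-pixel loop (the rect is clipped once) and that the pixel is packed once, so the inner loop is a plain run of identical 32-bit stores that a compiler can vectorise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack four bytes, given in framebuffer memory order, into one 32-bit
 * word. Using memcpy keeps the result correct on either endianness. */
uint32_t pack_pixel(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    const uint8_t bytes[4] = { b0, b1, b2, b3 };
    uint32_t word;
    memcpy(&word, bytes, sizeof word);
    return word;
}

/* Fill the half-open rect [x0,x1) x [y0,y1) with one pre-packed pixel.
 * The rect is clipped to the buffer once, up front, so the inner loop
 * has no per-pixel bounds check or blend dispatch.
 * Returns the number of pixels written. */
size_t fill_rect_rows(uint8_t *fb, size_t fb_len, size_t stride,
                      size_t x0, size_t y0, size_t x1, size_t y1,
                      uint32_t pixel)
{
    assert(fb != NULL && stride >= 4 && stride % 4 == 0);

    size_t max_cols = stride / 4;       /* pixels per row */
    size_t max_rows = fb_len / stride;  /* whole rows in the buffer */
    if (x1 > max_cols) x1 = max_cols;
    if (y1 > max_rows) y1 = max_rows;
    if (x0 >= x1 || y0 >= y1) return 0;

    for (size_t y = y0; y < y1; y++) {
        uint8_t *row = fb + y * stride + x0 * 4;
        for (size_t x = 0; x < x1 - x0; x++)
            memcpy(row + x * 4, &pixel, sizeof pixel); /* one 32-bit store */
    }
    return (x1 - x0) * (y1 - y0);
}
```

This covers only the case where every destination pixel resolves to the same value; any blend mode that reads the destination still needs per-pixel work.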

Sub-tasks

  • AD-43.1 (Done 2026-05-27, commit 3fb055d; follow-up commit a12dc33 fixed the test-only struct literal that blocked the manual test invocation): Implemented (a) for the three blend-mode cases where every pixel resolves to the same destination value: Clear (mode 2), Src (mode 1), and SrcOver (mode 0) with alpha == 255. Per-pixel writePixel loop is replaced with a row-major @memset over a [*]u32 slice via a new fillRectFast helper. The remaining cases (SrcOver alpha < 255, Add, and the misaligned-stride fallback) stay on the per-pixel slow path.

    Three unit tests landed in drawfs.zig:

    • fillRectFast: writes only the rect, leaves surroundings untouched: small buffer, byte-checks that only the intended rect is touched.
    • fillRectFast: pixel layout matches writePixel for Src mode: fills an 8-pixel row two ways (per-pixel writePixel vs fillRectFast with composed u32), byte-compares across a 7-colour palette including alpha != 0xFF. This is the parity test that validates the BGRA-to-u32 endianness assumption.
    • fillRectFast: Clear mode equivalence: validates pixel=0 matches writePixel(blend_mode=2).

    All 12 inline tests (3 new + 4 pre-existing in drawfs.zig + 5 in inputfs_input.zig) pass via the manual zig test --dep backend --dep input -Mroot=drawfs.zig invocation documented in build.zig:967-973.

    Bench evidence that the change took effect (2026-05-27 post-install, post-reboot lldb sampling, 10 samples):

    • 1 sample landed in fillRect at line 1163, the early-return at the bottom of the fast-path branch, i.e. the fast path completed and returned.
    • 0 samples landed in the per-pixel writePixel inner loop (vs 3/3 samples deep in writePixel pre-AD-43.1).

    The fillRect bottleneck is resolved. semadrawd is no longer wedged in software pixel-fill for seconds at a time.

  • AD-43.2 (verification) (Ran 2026-05-27; closure criterion not met; AD-43.3 now critical not optional): Re-ran scripts/ad36-bench.sh after AD-43.1 landed and the machine was rebooted to ensure all components were on fresh binaries.

    Closure criterion: at least one state_valid:true pump_diagnostic event during the bench's 10-second window with cursor motion.

    Result: not met. Window pump count 0; pre-bench pump count and post-bench pump count both 33,399 (unchanged). inputdump captured 2,178 pointer.motion events arriving at the inputfs ring during the same window. Log lines written by the daemon during the window: 0 (the entire log was static across the bench window).

    The daemon's uptime at bench start was 118 seconds. Pre-bench pumps emitted: 33,399. That's an average of ~283 pumps/s across the daemon's full life, well above the AD-25 target of 10/s, so the loop can iterate fast when it gets the chance. But for 10 seconds during the bench, it didn't. CPU was 99.1% sustained across the same 10 seconds.

    Stack-sampling via sudo lldb -p $(pgrep -x semadrawd) -o "thread backtrace" repeated 10 times found:

    • 6 / 10 samples: in drawfs.doIoctl called from DrawfsBackend.blitToEfifb at drawfs.zig:675, which invokes the DRAWFSGIOC_BLIT_TO_EFIFB kernel ioctl to copy the rendered surface into the EFI framebuffer. The kernel-side handler at drawfs.c:864-877 copies one row at a time via copyin plus memcpy (15,360 bytes per row at 3840 wide, 2160 rows per full blit). Per-blit cost is on the order of 3 ms in theory; at the composite rate the daemon is achieving, the ioctl is now the dominant CPU consumer.

    • 3 / 10 samples: parked in posix.poll at semadrawd.zig:1105 with timeout=100. The loop does reach poll; this confirms the AD-41.3 wake source is in place and AD-43.1's fast path has removed the pre-AD-43.1 "stuck in fillRect" condition.

    • 1 / 10 samples: in fillRect at drawfs.zig:1163, the return after AD-43.1's fast path completed. Not the per-pixel inner loop. AD-43.1 worked.

    Per-fd state at the same moment (procstat -f):

    • fd 4: /dev/draw (backend cdev)
    • fd 5: /var/run/sema/input/events (mmap'd ring)
    • fd 6: /dev/inputfs_notify (AD-41.3 wake source)
    • fd 7: UDS /var/run/semadraw.sock

    dd if=/dev/draw bs=12 count=5 blocked indefinitely when probed, meaning the /dev/draw event queue was empty, ruling out the initial hypothesis that the daemon was poll-spinning on a permanently-readable /dev/draw evq.

    What this shifts in the AD-43 fix-path order: AD-43.3 (dirty-rect tracking / don't recomposite when nothing changed) is no longer "if needed"; it's the next required step. The bench evidence indicates composite is being called continuously even when the only input is cursor motion (or nothing at all), causing back-to-back full-screen blits at whatever rate the frame scheduler allows. Even with AD-43.1 making each fillRect fast, the per-frame work (compositing surfaces, computing damage, blitting to efifb) saturates the CPU and prevents the main loop from reaching pumpCursorPosition at a rate visible to the bench window.

    The two layered improvements that AD-43.3 should now target:

    1. Don't composite when there's no real damage. Investigate why damage_tracker.hasDamage() and scheduler.shouldComposite() (compositor.zig:236) are both returning true on every loop iteration despite no apparent input or surface change. This is the larger win: eliminate the work entirely when no work is needed.

    2. Pass a subrect to DRAWFSGIOC_BLIT_TO_EFIFB instead of the full surface. When composite does have to run, the damage area is typically far smaller than 3840x2160. Passing the actual damage rect through to the blit reduces the ioctl's per-call cost proportionally. This is the smaller win, and can land independently of (1).

  • AD-43.3 (compose / blit reduction) (Open as of 2026-05-27; 3a blocked on reproduction, 3b active): Two layered improvements. AD-43.3b is the active next step; AD-43.3a is blocked on reproducing the spin behaviour. Each can land independently; both compound.

    AD-43.3a (compose-gate audit) (Open, blocked on reproduction; originally filed 2026-05-27 afternoon, cannot proceed as of 2026-05-27 evening): Originally filed on the hypothesis that damage_tracker.hasDamage() and/or scheduler.shouldComposite() was returning true on every loop iteration, driving composite() and BLIT_TO_EFIFB unnecessarily on every iteration of the main loop. The bench evidence at filing time (AD-43.2, daemon session 16d3d9453307a617) was consistent with that hypothesis: 99 percent CPU, 6/10 lldb samples in BLIT_TO_EFIFB, zero log emission for 88 seconds.

    Diagnostic added: enabled UTF_COMPOSITE_GATE_INSTRUMENT in the semadrawd s6 run script (commit 3991931) so the daemon emits a composite_gate_diagnostic event per needsComposite() call, recording the (has_damage, should_composite, state_valid) tuple. The instrumentation works; events.zig:357 produces clean JSON lines that survive s6-log retention.

    Result on bench: the symptom did not reproduce after reboot. A new daemon (session 8e8d6e77f7e9b71d) shows healthy behaviour:

    • has_damage oscillates true/false correctly, approximately tracking surface activity.
    • should_composite oscillates true/false correctly per the frame_scheduler cadence.
    • frame_complete events fire at ~45 fps total across two surfaces (cursor + pgsd-sessiond UI), or ~22 fps per surface.
    • Diagnostic emission rate is ~300 lines/sec (mostly pump_diagnostic + composite_gate_diagnostic pairs). The loop iterates at a sustainable cadence, not the early-life 410K iter/sec burst that filled the log faster than s6-log could drain it.
    • CPU still 99 percent, but this is now legitimate work (software composite of two surfaces at 4K + full-screen BLIT_TO_EFIFB per composite). Not a spin bug.

    Tuple distribution on the healthy daemon (typical 100-line tail):

     4   has_damage:false, should_composite:false
    10   has_damage:true,  should_composite:false
     2   has_damage:true,  should_composite:true
    

    The 2 instances of (true, true) are followed immediately by frame_complete events; the 10 (true, false) reflect damage marked but the scheduler not yet at the next deadline. The (false, false) entries are post-frame quiet periods. This is exactly the intended state machine.
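The tuple classification above can be written down as a small tally, which also makes the percentages quoted elsewhere in this entry reproducible. This is a C sketch with hypothetical names (gate_sample, tally_gate, gate_percent); it assumes composite runs only when both has_damage and should_composite are true, which is how the (true, true) rows are read above, not something confirmed from needsComposite's source.

```c
#include <assert.h>
#include <stddef.h>

/* One composite_gate_diagnostic tuple (state_valid omitted). */
struct gate_sample { int has_damage; int should_composite; };

struct gate_tally {
    size_t idle;       /* (false, false): post-frame quiet period      */
    size_t waiting;    /* (true,  false): damage, deadline not reached */
    size_t composite;  /* (true,  true):  composite runs               */
    size_t other;      /* (false, true):  deadline with nothing to do  */
};

/* Assumed gate: composite only when there is damage AND the scheduler agrees. */
int needs_composite(struct gate_sample s)
{
    return s.has_damage && s.should_composite;
}

struct gate_tally tally_gate(const struct gate_sample *s, size_t n)
{
    struct gate_tally t = { 0, 0, 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (needs_composite(s[i]))       t.composite++;
        else if (s[i].has_damage)        t.waiting++;
        else if (!s[i].should_composite) t.idle++;
        else                             t.other++;
    }
    return t;
}

/* Share of `count` in `total`, as a percentage. */
double gate_percent(size_t count, size_t total)
{
    assert(total > 0);
    return 100.0 * (double)count / (double)total;
}
```

Fed the 16 gate tuples from the tail above (4, 10, 2), the shares are 25%, 62.5%, and 12.5%. A stalled daemon spinning on the gate would instead be expected to show the composite bucket near 100%.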

    Conclusion: the AD-43.3a hypothesis was wrong about the current system. Whether it was wrong about the earlier daemon (session 16d3d9453307a617) is undeterminable without reproduction. The session ended (likely killed by install.sh during one of the cycles between bench runs and this evening's investigation), so the state that produced the zero-emission stall cannot be inspected.

    To resume AD-43.3a, a reproduction is required. Two candidate scenarios to try:

    • install.sh + reboot sequence repeated several times, watching for a daemon that locks up shortly after startup.
    • Stress test: kill and relaunch pgsd-sessiond while the daemon is up. Disconnect/reconnect cycles may trip the issue.

    If a reproduction is captured, the existing UTF_COMPOSITE_GATE_INSTRUMENT instrumentation will record the tuple distribution during the stall, which is the diagnostic data AD-43.3a was originally trying to obtain.

    Estimated effort once reproduced: a day or two, depending on what the data shows.

    AD-43.3b (subrect blit) (Active as of 2026-05-27 evening): When composite runs, pass the damage area's bounding rect through BlitToEfifb rather than the full surface. The kernel handler at drawfs.c:864-877 already accepts dst_x, dst_y, width, height parameters; the userspace caller in drawfs.zig:666-673 currently passes the full surface dimensions, but the damage tracker knows the actual changed region. Plumb damage_rect into blitToEfifb.

    Now the load-bearing next step. Today's bench evidence confirms the composite scheduler is well-behaved; the cost is in the work each composite does. A full-screen 3840 x 2160 RGBA blit copies 33 MB into the EFI framebuffer per call. At ~45 fps that is 1.5 GB/sec of memory bandwidth on the kernel-side blit path alone, which is enough to saturate one CPU on its own. Most composites only need to update small regions (cursor moved a few pixels, a single character was typed); a subrect blit reduces the cost proportionally.

    Two implementation notes from today's reading of the code:

    • The kernel side at drawfs.c:864-877 mallocs a per-row temp buffer (stride bytes), copyins one row at a time, memcpys into the efifb mapping, and frees the buffer at the end. The temp-buffer cost is fixed; only the loop body scales with rows. So subrect blit reduces both the copyin work and the memcpy work proportional to dst_height / surface_height.

    • The userspace caller at drawfs.zig:666-673 has the request struct already, so adding dst_x / dst_y / width / height parameters requires changing only the call site, not the ABI between userspace and kernel.
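The size of the win can be estimated before touching the call site. The C sketch below uses hypothetical names (struct rect, rect_union, rect_clamp, blit_bytes) and a made-up 32 x 32 cursor; it computes the bounding rect of two damage rects, clamps it to the surface, and counts the bytes a blit of that rect would move. It assumes each copied row is narrowed to the rect width as well as the row count being reduced; if the handler keeps copying full-stride rows, the saving scales with height only.

```c
#include <assert.h>
#include <stdint.h>

struct rect { uint32_t x, y, w, h; };

/* Smallest rect covering both inputs; an empty rect is the identity. */
struct rect rect_union(struct rect a, struct rect b)
{
    if (a.w == 0 || a.h == 0) return b;
    if (b.w == 0 || b.h == 0) return a;

    uint32_t x0 = a.x < b.x ? a.x : b.x;
    uint32_t y0 = a.y < b.y ? a.y : b.y;
    uint32_t ax1 = a.x + a.w, bx1 = b.x + b.w;
    uint32_t ay1 = a.y + a.h, by1 = b.y + b.h;
    uint32_t x1 = ax1 > bx1 ? ax1 : bx1;
    uint32_t y1 = ay1 > by1 ? ay1 : by1;

    struct rect u = { x0, y0, x1 - x0, y1 - y0 };
    return u;
}

/* Clip a rect to a surface of surf_w x surf_h pixels. */
struct rect rect_clamp(struct rect r, uint32_t surf_w, uint32_t surf_h)
{
    struct rect empty = { 0, 0, 0, 0 };
    if (r.x >= surf_w || r.y >= surf_h) return empty;
    if (r.w > surf_w - r.x) r.w = surf_w - r.x;
    if (r.h > surf_h - r.y) r.h = surf_h - r.y;
    return r;
}

/* Bytes moved by blitting exactly this rect. */
uint64_t blit_bytes(struct rect r, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return (uint64_t)r.w * r.h * bytes_per_pixel;
}
```

A full 3840 x 2160 blit is 33,177,600 bytes, or about 1.49 GB/s at 45 blits per second. A cursor that moves from (100, 100) to (104, 102) damages a 36 x 34 bounding rect, which is 4,896 bytes per blit.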

    Estimated effort: a couple of hours, including bench measurement of the CPU delta.

    Closure: when AD-43.3b lands and the bench window observes at least one state_valid:true pump_diagnostic event under motion. That closes AD-43.2's deferred criterion and transitively closes AD-36 and AD-25. AD-43.3a's closure is no longer on the critical path for AD-36 closure; it stays open as a latent issue awaiting reproduction.

  • AD-43.4 (GPU backend, separate): Not scoped under this entry; the existing Vulkan backend discussion is the durable home for (c).

Relationships

  • Surfaced by AD-41.4 (the AD-41 bench verification). AD-41.3's wake source works; AD-43 is a separate, independent cause of the same user-visible symptom.
  • Blocks AD-36 closure. AD-36's code change is in place but cannot be bench-verified at 0.2 Hz iteration rate.
  • Blocks AD-25 closure via AD-36.
  • Affects AD-32 / AD-37 (busy-spin investigation). The lldb stack identifies the spin site precisely (composite -> render -> fillRect -> writePixel) which is much more actionable than the previous bench data was. AD-32 / AD-37 should update their framing to point at AD-43; the busy spin observed on bench was always primarily this, not whatever AD-32 / AD-37 originally hypothesised.

Filing trace

  • 2026-05-27 morning: surfaced during AD-41.4 bench verification. lldb userspace backtrace, procstat <running> stack samples, and dmesg framebuffer geometry captured in the filing.
  • 2026-05-27 afternoon: AD-43.1 landed (commit 3fb055d; follow-up a12dc33 for the test-only struct literal). fillRect fast-path verified by unit tests (3 new tests, all 12 inline tests pass) and by lldb sampling (1/10 samples in fillRect tail, 0/10 in writePixel inner loop, vs 3/3 deep in writePixel pre-AD-43.1).
  • 2026-05-27 afternoon: AD-43.2 bench run. The fillRect bottleneck moved; the new dominant cost is DRAWFSGIOC_BLIT_TO_EFIFB (6/10 lldb samples) plus the underlying question of why composite is being called continuously. AD-43.3 was sketched as "if needed"; the bench evidence makes it required. AD-43.3 was refined to enumerate AD-43.3a (compose-gate audit) and AD-43.3b (subrect blit) as the two layered next steps.
  • 2026-05-27 evening: AD-43.3a's gate instrumentation landed (commit 3991931, UTF_COMPOSITE_GATE_INSTRUMENT enabled in the s6 run script). Bench captures after reboot show the symptom did not reproduce; the new daemon session is healthy with the scheduler oscillating correctly and frame_complete events firing at ~45 fps total. AD-43.3a moved to "blocked on reproduction" status; AD-43.3b promoted to active. The AD-43.2 bench's pathological state (zero log emission for 88 seconds despite 99 percent CPU) remains unexplained. See "Update 2026-05-27 evening" below for the detailed reasoning.

Update 2026-05-27

The AD-43 entry now records two distinct bottlenecks that were never visible in the same lldb sampling pass because the first one masked the second:

  • Pre-AD-43.1: fillRect inner loop, per-pixel writePixel calls. 100 percent of CPU went here on the bench. Resolved by AD-43.1.
  • Post-AD-43.1 (AD-43.2 bench, session 16d3d9453307a617): DRAWFSGIOC_BLIT_TO_EFIFB ioctl, daemon at 99 percent CPU, zero log emission for 88 seconds. This was hypothesised to be damage_tracker/scheduler logic spinning unnecessarily, and AD-43.3a was filed to investigate. AD-43.3a's diagnostic instrumentation (commit 3991931) landed but the symptom did not reproduce; see the 2026-05-27 evening update below.

AD-36 / AD-25 remain bench-blocked. After the evening investigation the blocker is now AD-43.3b (subrect blit), not AD-43.3 generally.

The pattern is worth noting for future bench work: a single perf bottleneck removal often exposes the next layer. Three layers of fixing may yet land here before the bench window observes a state_valid:true pump event. The blit-and-compose path is well-understood and the diagnosis is concrete.

Update 2026-05-27 evening

After the gate instrumentation landed, two bench captures were taken with the new daemon (session 8e8d6e77f7e9b71d). The captures showed healthy behaviour:

  • Frame_complete events at ~45 fps total across two surfaces (cursor + pgsd-sessiond UI).
  • composite_gate_diagnostic tuple distribution roughly 25 percent (false, false), 60 percent (true, false), 15 percent (true, true). The (true, true) entries are immediately followed by frame_complete events; the scheduler is correctly gating composite on the deadline.
  • All pump_diagnostic events show state_valid:false because no cursor motion was harvested in the bench window; that is consistent with no input, not a bug.
  • Daemon still at 99 percent CPU. That CPU is now understood as the legitimate cost of two software-composited surfaces at 4K, each producing a full 33 MB BLIT_TO_EFIFB per frame.

The original AD-43.3a hypothesis ("damage_tracker / scheduler is spinning on every iteration") was wrong about the current system. Whether it was wrong about the earlier daemon (session 16d3d9453307a617) cannot be determined because that session ended.

What we did learn from the bench data:

  • The composite + blit work is expensive, not spurious. Even when the scheduler does its job correctly, 45 fps of full-screen 4K software composite saturates one CPU. The fix is not to call composite less often; the fix is to make each composite cheaper.
  • AD-43.3b (subrect blit) addresses exactly that. Reducing each blit's region from 3840 x 2160 to the actual damage rect shrinks the kernel-side copyin / memcpy proportionally. Most composites only need a small damage rect (cursor movement, single character typed, blink toggle); the bandwidth saving compounds across the 45 fps rate.
  • AD-43.3a is parked. The gate instrumentation stays enabled in case the spin recurs; if it does, the composite_gate_diagnostic events captured during the stall will give us the data we wanted today.

Pivot: AD-43.3b is the active next step. AD-43.3a is open but blocked on reproduction. AD-43 header status string updated accordingly.

[ ] AD-44: inputfs kbdmux bridge has no consumer on PGSD post-AD-39 (Open, Small, P3; surfaced 2026-05-27 evening during the "is drawfs replacing vt(4) and efifb?" audit; documentation-only disposition chosen, no code change)

ADR 0019 (2026-05-09) implements an inputfs->kbdmux bridge so vt(4) at ttyv0..ttyvN can receive keystrokes through kbdmux when the only HID input source is inputfs's HID parser. AD-39 (2026-05-13/14) compiled vt, vt_vga, vt_efifb, vt_vbefb, sc, vga, splash out of the PGSD kernel config. kbdmux itself is retained, but the bridge's intended consumer (vt(4)) is no longer in the PGSD kernel.

So on PGSD kernels, the bridge publishes scancodes into a kbdmux that has no reader. ADR 0019's "Post-AD-39 disposition" addendum (added 2026-05-27 evening) records the detailed behaviour; the short summary is:

  • The bridge code path (inputfs_kbd_intr_cb, inputfs_kbd_emit_at, taskqueue enqueue, notify_task Giant acquire, kbdmux KBDIO_KEYINPUT callback) runs on every keystroke.
  • kbdmux holds the scancodes in the bridge's 1024-entry ring buffer until full, then silently drops further keys (inputfs_kbd_put_key line 441).
  • No /dev/kbdmux0 reader pulls from the ring, because there is no vt(4) to do so.
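The fill-then-drop behaviour described above can be modelled in a few lines. This is an illustrative C sketch, not the code in inputfs_kbdmux.c: the struct kbd_ring layout and the name kbd_ring_put are hypothetical, the 4-byte entry size is an assumption inferred from the ~4 KB figure in the cost analysis, and the dropped counter is added here purely to make the drops observable (the source describes the real drop as silent).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KBD_RING_ENTRIES 1024u

/* Hypothetical model of the bridge's per-unit key ring. */
struct kbd_ring {
    uint32_t keys[KBD_RING_ENTRIES]; /* 1024 x 4 bytes = 4096 bytes */
    size_t head;                     /* index of the oldest entry   */
    size_t count;                    /* entries currently queued    */
    size_t dropped;                  /* illustration only           */
};

/* Queue one key. Returns 1 if stored, 0 if the ring was full and the
 * key was discarded. With no reader, count only ever grows, so every
 * call after the first 1024 takes the drop branch. */
int kbd_ring_put(struct kbd_ring *r, uint32_t key)
{
    assert(r != NULL);
    if (r->count == KBD_RING_ENTRIES) {
        r->dropped++;
        return 0;
    }
    r->keys[(r->head + r->count) % KBD_RING_ENTRIES] = key;
    r->count++;
    return 1;
}
```

Under this model the consumerless steady state is reached after the first 1024 keystrokes following boot; from then on the bridge path still runs per keystroke but stores nothing.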

Cost analysis

Memory: fixed ~4 KB per bridge softc instance (one ring per bridge unit, typically 1 unit).

CPU: per-keystroke overhead, dominated by the Giant acquire/release in inputfs_kbd_notify_task and the kbdmux-layer callback work. At normal typing rates (a few dozen keys/sec) this is invisible. At pathological rates (keyboard stuck in repeat) it could matter but should still not be visible against the rest of inputfs's HID-path cost. No bench measurement of this cost has been made; the above is an estimate from code reading.

Surface area: the bridge is code that runs on every keystroke in spin-mutex context with no consumer, in a kernel where a panic is invisible on the framebuffer (per AD-39). A bug inside the bridge path would be hard to diagnose. This is a more interesting concern than the CPU cost.

Options considered

A. Document only (chosen). Update ADR 0019 with the post-AD-39 disposition (done in commit landing this entry). Update AD-39's "what is retained" sentence to acknowledge the consumerless situation (done). File this AD-44 entry so the question stays visible. No code change.

B. Default-off on PGSD. Change the SYSCTL default for hw.inputfs.kbdmux_bridge from 1 to 0. Bridge becomes opt-in. Anyone running inputfs.ko on a non-PGSD system that has vt(4) in the kernel would lose console-login keystrokes silently until they set the sysctl back to 1. Not chosen: the sysctl default is compile-time, not runtime-conditional on "is vt(4) present?", so the change affects everyone using the same inputfs.ko regardless of kernel.

C. Compile-out on PGSD. Build-time gate (INPUTFS_KBDMUX_BRIDGE) that the PGSD kernel config omits. Zero surface area on PGSD; non-PGSD systems keep the bridge. Most invasive: source-level #ifdefs and a maintenance burden where inputfs.ko behaves differently on PGSD vs other systems at the binary level. Not chosen: the cost-of-doing-nothing is small enough that the compile-out work is premature.

Disposition

Option A (document only). Conservative choice given the audit context ("better track our changes so we do not go off-course"): making a code change based on the surface-area suspicion without bench evidence is the same class of mistake as the ADR 0002 misadventure two days earlier, where direction preceded fact-finding. AD-44 records the question so it stays visible, but does not act on it without observed harm.

What would trigger revisiting

Any of the following moves AD-44 from "document only" toward Option B or Option C:

  • A kernel panic backtrace shows the bridge code path (inputfs_kbd_intr_cb, inputfs_kbd_emit_at, or inputfs_kbd_notify_task) as the panic site or immediate caller.

  • A bench measurement shows the per-keystroke bridge cost is non-negligible (e.g. measurable in pmcstat or showing up in lldb sampling on the inputfs interrupt path).

  • A new non-vt(4) consumer of kbdmux is identified or proposed (e.g. a hypothetical recovery shell). In that case the bridge's purpose continues post-AD-39 and the disposition becomes "keep enabled," not just "document."

  • The complement: a verifiable absence of any plausible future consumer. In that case Option C (compile-out) becomes the cleanest path.

References

  • inputfs/docs/adr/0019-kbdmux-bridge.md, Status section "Post-AD-39 disposition" addendum (the authoritative description of the bridge's post-AD-39 environment).
  • inputfs/sys/dev/inputfs/inputfs_kbdmux.c:139-143 (the hw.inputfs.kbdmux_bridge SYSCTL declaration, default 1).
  • inputfs/sys/dev/inputfs/inputfs_kbdmux.c:429-449 (inputfs_kbd_put_key, the silent-drop-on-full behaviour).
  • inputfs/sys/dev/inputfs/inputfs_kbdmux.c:673-685 (inputfs_kbd_notify_task, the Giant-acquire per-keystroke).
  • BACKLOG.md AD-39 (line 9929), the supersedure of AD-10 via kernel compile-out; the entry's "kbdmux is retained" sentence now forward-references this AD-44.
  • BACKLOG.md AD-10.5 (line 3466), the original closure of the keystroke-handover sub-stage; it was closed independently of AD-39's framebuffer-ownership change and is not affected by this entry.