
Commit de87e70

plans: symlink firmware-namei fix verified — update kmodloader Phase 3
+ defer gunion-reroot experiment

Symlink trick (cdroot /boot/firmware → /sysroot/boot/firmware) verified working on real hardware (Lenovo iwlwifi-8260: net.wlan.devices populates, wlan0 created, DRM DMC firmware loads). Backported to freebsd-livecd-unionfs commit 3858734 and freebsd-livecd-gunion commit 62ed74c.

Plan updates:
- freebsd-kmodloader-plan.html: Phase 3 (firmware loading from /boot/firmware/) marked RESOLVED with reference to freebsd-launchd commit 51e1b80. Open-question Q. Firmware loading also marked RESOLVED.
- freebsd-livecd-gunion-reroot-plan.html: status changed from EXPERIMENTAL → DEFERRED. TLDR notes the symlink trick won the comparison; pursue only if some future need genuinely requires kernel-and-userspace-share-a-root semantics. Phase 5 (comparison) marked RESOLVED with the actual comparison table. Open-question Q5 marked RESOLVED.
- index.html: pill on freebsd-livecd-gunion-reroot entry changed from warn (EXPERIMENTAL) to info (DEFERRED). Description updated to reflect the simpler fix winning.

The reroot architecture remains real engineering and the plan still documents how to do it correctly (loader-preload, busy-cd9660 dodge, phased delivery). It just isn't urgent or even necessary for the problem we originally wrote it for.
1 parent f69a079 commit de87e70
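
The commit message describes the fix as a single symlink created at ISO build time. The following is a minimal shell sketch of that idea, not the actual contents of `build.sh`: the helper name `link_firmware`, the staging directory, and the `demo.ucode` file are all hypothetical. It first creates the link as the ISO would carry it (an absolute target that stays dangling until `/sysroot` is mounted at boot), then simulates the boot-time lookup locally by pointing a second link at a staged directory and reading a file through it.

```shell
#!/bin/sh
# Sketch only: helper name, staging layout and demo.ucode are hypothetical,
# not copied from build.sh.
set -eu

# Create <cdroot>/boot/firmware as a symlink to the given target.
# The target string is stored verbatim and need not exist at build time.
link_firmware() {
    cdroot=$1
    target=$2
    mkdir -p "$cdroot/boot"
    ln -s "$target" "$cdroot/boot/firmware"
}

stage=$(mktemp -d "${TMPDIR:-/tmp}/fwlink.XXXXXX")

# 1. What the ISO tree would carry: an absolute link that only resolves
#    once /sysroot is mounted at boot.
link_firmware "$stage/cdroot" /sysroot/boot/firmware
iso_target=$(readlink "$stage/cdroot/boot/firmware")

# 2. Local simulation of the lookup: point a second link at a staged
#    "sysroot" and read a fake firmware blob through it.
mkdir -p "$stage/sysroot/boot/firmware"
printf 'fake-blob\n' > "$stage/sysroot/boot/firmware/demo.ucode"
link_firmware "$stage/simroot" "$stage/sysroot/boot/firmware"
sim_read=$(cat "$stage/simroot/boot/firmware/demo.ucode")

rm -rf "$stage"

echo "ISO link target: $iso_target"
echo "read through simulated link: $sim_read"
```

The simulation only shows ordinary path resolution through a symlink in userspace; whether kernel-context namei crosses the cd9660-to-unionfs boundary the same way is what the hardware test reported in the commit message established.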

3 files changed

Lines changed: 31 additions & 30 deletions


freebsd-kmodloader-plan.html

Lines changed: 8 additions & 9 deletions
@@ -404,15 +404,14 @@ <h3>Phase 2 &mdash; hot-plug via /dev/devctl <span class="pill info">PLANNED</sp
 </div>
 
 <div class="phase todo">
-<h3>Phase 3 &mdash; firmware loading from /boot/firmware/ <span class="pill info">PLANNED</span></h3>
+<h3>Phase 3 &mdash; firmware loading from /boot/firmware/ <span class="pill ok">RESOLVED</span></h3>
 <p>Bench testing surfaced a real problem: drm-kmod's GPU drivers and the iwlwifi LinuxKPI port load firmware via <code>firmware_get()</code> + <code>try_binary_file()</code> in <code>sys/kern/subr_firmware.c</code>. The function <code>vn_open</code>s <code>/boot/firmware/&lt;name&gt;</code>, but in our chroot setup the kernel context's namei resolves <code>/boot/firmware/</code> against the cd9660 root (loader-only files) instead of the unionfs upper layer (where pkg-installed firmware lives). Result: i915kms loads but DMC firmware fails with "could not load binary firmware"; iwlwifi reports "File size way too small!" on a 2.4 MB blob.</p>
-<p>Phase 3 work: fix the live-ISO path so <code>/boot/firmware/</code> in the kernel's vfs view sees the same files userspace sees. Likely candidates:</p>
-<ul>
-<li>Copy firmware files into the cd9660 layer at build time so kernel namei finds them on the read-only path.</li>
-<li>Or reshape the chroot/unionfs setup so kernel-context paths resolve through the union.</li>
-<li>Or pre-load all firmware-as-kld wrapper modules at boot via loader.conf so <code>firmware_get()</code> hits the registered-kld path before the file-based fallback.</li>
-</ul>
-<p>Out of scope for kmodloader proper; tracked here because it's the next blocker for fully-automatic GPU + WiFi support on the live ISO.</p>
+
+<p><strong>Resolved by the symlink trick</strong> in <a href="https://github.com/pkgdemon/freebsd-launchd/commit/51e1b80">freebsd-launchd commit 51e1b80</a>: <code>build.sh</code> creates a symlink at <code>cdroot/boot/firmware → /sysroot/boot/firmware</code>. Kernel namei follows the symlink across the cd9660-to-unionfs mount-point boundary into the unionfs view, finds the file, firmware loads. Verified empirically on a Lenovo with iwlwifi-8260: <code>net.wlan.devices</code> populates, <code>wlan0</code> is created, DRM DMC firmware loads.</p>
+
+<p>The symlink fix is overlay-mechanism-agnostic — works equally on unionfs and gunion. Backported to <a href="https://github.com/pkgdemon/freebsd-livecd-unionfs">freebsd-livecd-unionfs</a> and <a href="https://github.com/pkgdemon/freebsd-livecd-gunion">freebsd-livecd-gunion</a>.</p>
+
+<p>For the architectural alternative (gunion + <code>reboot -r</code> reroot, which would eliminate the chroot/namei split entirely instead of working around it), see the <a href="freebsd-livecd-gunion-reroot-plan.html">freebsd-livecd-gunion-reroot experimental plan</a>. Lower priority now that the symlink trick works.</p>
 </div>
 
 <div class="phase todo">
@@ -450,7 +449,7 @@ <h2 id="open">9. Open questions</h2>
 </div>
 
 <div class="open-q">
-<strong>Q. Firmware loading in live-ISO setup.</strong> See Phase 3. The kernel firmware loader's <code>try_binary_file()</code> reads from <code>/boot/firmware/</code> in kernel-thread context, which doesn't follow the chroot. On a normal FreeBSD install this is fine; on our unionfs+chroot live-ISO it's a real bug. Workarounds known; clean fix pending.
+<strong>Q. Firmware loading in live-ISO setup. — RESOLVED.</strong> The kernel firmware loader's <code>try_binary_file()</code> reads from <code>/boot/firmware/</code> in kernel-thread context, which doesn't follow the chroot. Fixed by symlinking <code>/boot/firmware</code> on the cd9660 layer to <code>/sysroot/boot/firmware</code>: kernel namei follows the symlink across the unionfs mount-point boundary into the right view. See Phase 3 above.
 </div>
 
 <div class="open-q">

freebsd-livecd-gunion-reroot-plan.html

Lines changed: 21 additions & 19 deletions
@@ -86,13 +86,15 @@
 <a href="#refs">References</a>
 </nav>
 
-<h1>freebsd-livecd-gunion-reroot — experimental architecture <span class="pill warn">EXPERIMENTAL</span></h1>
-<p class="subtitle">A research-grade attempt to fix the kernel-vs-userspace namespace split that today blocks kernel-context firmware loading on all our livecd variants. Combines <code>gunion</code> (block-level RAM overlay) with FreeBSD's <code>reboot -r</code> reroot mechanism, sidestepping the chroot pattern entirely. Intended as a separate sibling repo (<code>freebsd-livecd-gunion-reroot</code>) so we can iterate without disturbing the working <a href="freebsd-livecd-plan.html">freebsd-livecd</a> projects.</p>
+<h1>freebsd-livecd-gunion-reroot — experimental architecture <span class="pill info">DEFERRED</span></h1>
+<p class="subtitle">A research-grade attempt to fix the kernel-vs-userspace namespace split via <code>gunion</code> (block-level RAM overlay) + <code>reboot -r</code> reroot. <strong>The simpler symlink trick (cdroot <code>/boot/firmware</code> → <code>/sysroot/boot/firmware</code>) verified working on real hardware, which addresses the immediate firmware-loading problem without an architectural rewrite.</strong> This experiment is preserved as a reference for the alternative architecture; pursue if the symlink approach starts hitting limitations or if a future feature requires kernel and userspace to genuinely share a root.</p>
 
 <section id="tldr" class="tldr">
 <h3>TL;DR</h3>
 <ul>
-<li><strong>Status:</strong> not yet built. This document plans the experiment. Sibling repos <a href="https://github.com/pkgdemon/freebsd-livecd-gunion">freebsd-livecd-gunion</a> and <a href="https://github.com/pkgdemon/freebsd-livecd-unionfs">freebsd-livecd-unionfs</a> are working today using a chroot-based pattern; this is a clean-sheet re-architecture attempt that does NOT touch them.</li>
+<li><strong>Status:</strong> <span class="pill info">DEFERRED</span>. The simpler <strong>symlink trick</strong> (<code>cdroot/boot/firmware → /sysroot/boot/firmware</code>) was implemented and <strong>verified working on real hardware</strong> in <a href="https://github.com/pkgdemon/freebsd-launchd/commit/51e1b80">freebsd-launchd commit 51e1b80</a> (Lenovo iwlwifi-8260: <code>net.wlan.devices</code> populates, wlan0 created, DMC firmware loads). Backported to both <a href="https://github.com/pkgdemon/freebsd-livecd-unionfs">freebsd-livecd-unionfs</a> and <a href="https://github.com/pkgdemon/freebsd-livecd-gunion">freebsd-livecd-gunion</a>. This document is preserved as the alternative-architecture reference; build-out is no longer urgent.</li>
+<li><strong>When this might still be worth doing:</strong> if a future feature genuinely requires kernel and userspace to share a single root namespace (not just for firmware, but for kld auto-search paths, kernel-context vfs operations the symlink can't redirect, etc.). Or if we hit a wall with the symlink approach in some path we haven't anticipated. Otherwise the symlink fix is sufficient and this experiment stays on the shelf.</li>
+<li><strong>Original premise (still accurate as architecture analysis):</strong> sibling repos use chroot for the userspace handoff, so kernel root stays cd9660. <code>reboot -r</code> would actually move the kernel's root to the gunion-backed UFS, eliminating the namespace split. The symlink trick achieves the same end goal (kernel can resolve <code>/boot/firmware/foo</code>) without changing the boot architecture — just by exploiting that kernel namei follows symlinks across mount points.</li>
 <li><strong>The problem:</strong> our existing livecds boot from cd9660, mount rootfs.uzip via mdconfig, overlay it (unionfs or gunion), and chroot into the overlay. The kernel's root stays as cd9660 forever. Userspace and kernel see different views of <code>/boot/firmware/</code>. Result: when iwlwifi/i915kms etc. ask for firmware via kernel-context <code>vn_open</code>, the file is invisible. Wifi doesn't come up; GPU acceleration is degraded.</li>
 <li><strong>The proposed fix:</strong> use <code>reboot -r</code> after early init.sh setup, so the kernel actually re-mounts root from the gunion-backed UFS. Now the kernel and userspace see the same root. Kernel-context firmware loading works.</li>
 <li><strong>The catch:</strong> <code>kern_reroot()</code> calls <code>vfs_unmountall()</code> before re-mounting root. cd9660 has the rootfs.uzip vnode held busy by mdconfig — unmount fails. Either the system gets stuck half-rerooted or, worse, panics in <code>vfs_mountroot()</code> due to inconsistent root-vnode state.</li>
@@ -333,20 +335,20 @@ <h3>Phase 4 — boot test extension in CI <span class="pill info">PLANNED</span>
 </div>
 
 <div class="phase todo">
-<h3>Phase 5 — comparison + decision <span class="pill info">PLANNED</span></h3>
-<ul>
-<li>Compare against the symlink-trick approach in freebsd-launchd (commit 51e1b80). Three axes:
-<ul>
-<li>Does it work? (Both should — different mechanisms, same end goal.)</li>
-<li>RAM cost?</li>
-<li>Boot time?</li>
-<li>Code complexity?</li>
-<li>Future-proofness for similar issues?</li>
-</ul>
-</li>
-<li>If reroot is meaningfully better: plan migration of freebsd-launchd to the same model.</li>
-<li>If symlink trick is "good enough": keep this as a reference architecture but don't migrate.</li>
-</ul>
+<h3>Phase 5 — comparison + decision <span class="pill ok">RESOLVED — symlink wins</span></h3>
+<p>Comparison done before reroot was built. The symlink trick succeeded on real hardware:</p>
+<table class="compact">
+<thead><tr><th>Axis</th><th>Symlink (shipped)</th><th>gunion + reroot (this plan)</th></tr></thead>
+<tbody>
+<tr><td>Code complexity</td><td>1 line in build.sh</td><td>Phase 0-4 work (~3-5 days)</td></tr>
+<tr><td>RAM floor</td><td>~200MB (unchanged)</td><td>~600-800MB</td></tr>
+<tr><td>Boot time</td><td>unchanged</td><td>+10-30s loader preload</td></tr>
+<tr><td>Risk of breaking boot path</td><td>essentially zero</td><td>real (half-rerooted state, panic on busy cd9660)</td></tr>
+<tr><td>Solves firmware loading</td><td>yes (verified)</td><td>yes (would have, if built)</td></tr>
+<tr><td>Solves future kernel-context vfs ops</td><td>only paths under /boot/firmware</td><td>everything (kernel root = userspace root)</td></tr>
+</tbody>
+</table>
+<p>The symlink wins on every axis except "future-proofing for unrelated kernel-context vfs operations" — which we don't currently need, and which hasn't surfaced as a real problem after the firmware case was fixed. Keep this experiment on the shelf in case that future need materializes.</p>
 </div>
 
 <h2 id="risks">7. Risks</h2>
@@ -409,8 +411,8 @@ <h2 id="open">9. Open questions</h2>
 <strong>Q4. What happens to file descriptors/processes during reroot?</strong> All processes (init included) get killed and restarted. So our shell init script that calls <code>reboot -r</code> will be killed mid-execution. The kernel re-mounts root and execs <code>init_path</code> in the new root. The shell script's "after reboot -r" code never runs. Need to make sure <code>reboot -r</code> is the LAST thing in init.sh.
 </div>
 
-<div class="open-q">
-<strong>Q5. Symlink trick vs. reroot — which wins?</strong> The symlink trick (<a href="https://github.com/pkgdemon/freebsd-launchd/commit/51e1b80">freebsd-launchd commit 51e1b80</a>) is much smaller and might solve the same problem. Phase 5 of this plan compares the two head-to-head. If symlink wins, this experimental architecture stays as a reference but doesn't get adopted.
+<div class="resolved">
+<strong>RESOLVED — Q5. Symlink trick vs. reroot.</strong> Symlink wins. Verified working on Lenovo iwlwifi-8260 hardware in <a href="https://github.com/pkgdemon/freebsd-launchd/commit/51e1b80">freebsd-launchd commit 51e1b80</a>; backported to freebsd-livecd-unionfs and freebsd-livecd-gunion. Reroot remains the architectural answer if a future need surfaces, but for the firmware problem specifically the simpler fix is sufficient.
 </div>
 
 <div class="open-q">
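
Q4 in the hunk above says that `reboot -r` kills every process, so nothing placed after it in `init.sh` will run. The following is a minimal shell sketch of how the tail of such a script might be ordered. It is not taken from the plan: `NEW_ROOT`, its placeholder device path, the `DRY_RUN` switch, and the helper names are all hypothetical. Setting `vfs.root.mountfrom` with `kenv` before calling `reboot -r` follows the reroot procedure as described in reboot(8); with `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: NEW_ROOT, DRY_RUN and the helper names are hypothetical,
# and the device path is a placeholder, not a real gunion device name.
set -eu

NEW_ROOT=${NEW_ROOT:-ufs:/dev/EXAMPLE}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Final steps of a reroot-style init script. All setup must already be
# finished: once reboot -r takes effect, this shell is killed.
reroot_tail() {
    run kenv vfs.root.mountfrom="$NEW_ROOT"
    run reboot -r   # must stay the last command in the script
}

plan=$(reroot_tail)
echo "$plan"
```

Keeping the reroot in a single function at the very end makes the ordering constraint visible: anything appended below the call would be dead code on a real boot.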

index.html

Lines changed: 2 additions & 2 deletions
@@ -106,8 +106,8 @@ <h2>FreeBSD Research</h2>
 <span class="desc">Why a Linux-style squashfs livecd is hard on FreeBSD today: GSoC 2023 kernel work, geom_uzip vs squashfs vs tarfs, gunion vs unionfs, reboot -r vs init_chroot.</span>
 </li>
 <li>
-<a href="freebsd-livecd-gunion-reroot-plan.html?v=20260506">freebsd-livecd-gunion-reroot &mdash; experimental architecture <span class="pill warn">EXPERIMENTAL</span></a>
-<span class="desc">Combine gunion (block-level RAM overlay) with <code>reboot -r</code> (kernel actually re-mounts root). Sidesteps the chroot pattern entirely so kernel-context firmware loading works natively. Loader-preloads rootfs.uzip into RAM (mfsBSD style) to avoid the busy-cd9660-vnode panic risk. Planned as a separate sibling repo so we can iterate without disturbing working livecd projects.</span>
+<a href="freebsd-livecd-gunion-reroot-plan.html?v=20260507">freebsd-livecd-gunion-reroot &mdash; experimental architecture <span class="pill info">DEFERRED</span></a>
+<span class="desc">Combine gunion (block-level RAM overlay) with <code>reboot -r</code> reroot to eliminate the kernel/userspace namespace split. Deferred &mdash; the simpler symlink trick (<code>cdroot/boot/firmware → /sysroot/boot/firmware</code>) verified working on real hardware, addressing the firmware-loading problem without architectural rewrite. Plan preserved as a reference if some future need genuinely requires kernel-and-userspace-share-a-root semantics.</span>
 </li>
 <li>
 <a href="freebsd-launchd-plan.html?v=20260505">FreeBSD launchd &mdash; porting plan</a>
