
feat(sandbox): load system CA certificates for upstream TLS connections #862

Merged
johntmyers merged 1 commit into NVIDIA:main from matz3:sandbox/load-system-ca-certs
Apr 16, 2026

Conversation

@matz3
Contributor

@matz3 matz3 commented Apr 16, 2026

Summary

The sandbox proxy's upstream TLS client only trusted Mozilla root CAs (via webpki-roots), which broke TLS connections to internal or corporate hosts that use private CA certificates. This PR loads system CA certificates from the container's trust store (e.g. /etc/ssl/certs/ca-certificates.crt) in addition to Mozilla roots, so custom sandbox images can include corporate CAs via update-ca-certificates.

Related Issue

No dedicated GitHub issue created.
The problem: TLS connections to internal or corporate hosts fail when the host's certificate chains to a non-public root CA.

Changes

  • l7/tls.rs: build_upstream_client_config now accepts a system_ca_bundle string and loads those PEM certs into the rustls root store alongside webpki-roots
  • l7/tls.rs: New load_pem_certs_into_store helper that parses a PEM bundle into a RootCertStore, returning (added, ignored) counts
  • l7/tls.rs: read_system_ca_bundle promoted to pub; write_ca_files now takes the pre-read bundle as a parameter to avoid reading it twice
  • lib.rs: run_sandbox reads the system CA bundle once and passes it to both write_ca_files and build_upstream_client_config
  • Tests: 7 new unit tests covering single/multiple/empty/malformed PEM parsing and write_ca_files output
  • Architecture docs: Updated sandbox.md and gateway-security.md to reflect the new trust chain behavior
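To make the `(added, ignored)` contract of `load_pem_certs_into_store` concrete, here is a minimal std-only sketch. It is not the code from this PR: `DemoRootStore` is a hypothetical stand-in for the rustls `RootCertStore`, and `looks_like_base64` is an assumed structural check that replaces real certificate parsing. It only shows how a bundle can be walked so that malformed or truncated blocks are counted rather than aborting the load.

```rust
/// Hypothetical stand-in for rustls' `RootCertStore`; it only keeps the
/// base64 body of each accepted certificate block.
#[derive(Default)]
struct DemoRootStore {
    certs: Vec<String>,
}

const BEGIN: &str = "-----BEGIN CERTIFICATE-----";
const END: &str = "-----END CERTIFICATE-----";

/// Assumed structural check only: non-empty, length a multiple of four,
/// base64 alphabet. A real store would parse the DER and reject anything
/// that is not a usable trust anchor.
fn looks_like_base64(body: &str) -> bool {
    !body.is_empty()
        && body.len() % 4 == 0
        && body
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'+' | b'/' | b'='))
}

/// Walks a PEM bundle and returns `(added, ignored)` counts.
fn load_pem_certs_into_store(store: &mut DemoRootStore, pem_bundle: &str) -> (usize, usize) {
    let (mut added, mut ignored) = (0, 0);
    // `Some(body)` while inside a BEGIN/END pair, `None` otherwise.
    let mut current: Option<String> = None;

    for line in pem_bundle.lines().map(str::trim) {
        if line == BEGIN {
            // A BEGIN inside an open block means the previous block was never closed.
            if current.replace(String::new()).is_some() {
                ignored += 1;
            }
        } else if line == END {
            if let Some(body) = current.take() {
                if looks_like_base64(&body) {
                    store.certs.push(body);
                    added += 1;
                } else {
                    ignored += 1;
                }
            }
        } else if let Some(body) = current.as_mut() {
            body.push_str(line);
        }
    }
    // A block still open at end of input is truncated.
    if current.is_some() {
        ignored += 1;
    }
    (added, ignored)
}
```

Returning counts instead of an error lets a caller log how many certificates were skipped while still trusting the valid ones, which fits a bundle assembled from several sources.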

Testing

  • mise run pre-commit passes

  • Unit tests added/updated

  • E2E tests not applicable (according to principal-engineer-reviewer agent)

  • Manual testing via custom image

    • Example Dockerfile:
      FROM ghcr.io/nvidia/openshell-community/sandboxes/base:latest
      
      USER root
      
      # Copy your corporate/internal CA certificate
      COPY my-corporate-ca.crt /usr/local/share/ca-certificates/my-corporate-ca.crt
      
      # Rebuild the system trust store
      RUN /usr/sbin/update-ca-certificates
      
      USER sandbox
  • Security validation (via Claude Code / Opus 4.6)

    • The system CA bundle is read once at startup in the supervisor process, before the sandboxed child exists. The child cannot write to the CA paths due to Landlock (read-only) and DAC (unprivileged user). On restart, the same protections apply. The trust boundary is the container image builder, which is appropriate.
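The read-once flow described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `read_first_bundle`, `write_ca_files_demo`, `build_client_config_demo`, and `startup_demo` are hypothetical names, and the second candidate path is an assumption for RHEL-style images (only `/etc/ssl/certs/ca-certificates.crt` appears in this PR).

```rust
use std::fs;
use std::path::Path;

/// Candidate bundle locations. Only the first path appears in the PR text;
/// the second is an assumption covering RHEL-style images.
const CANDIDATE_BUNDLES: &[&str] = &[
    "/etc/ssl/certs/ca-certificates.crt",
    "/etc/pki/tls/certs/ca-bundle.crt",
];

/// Returns the contents of the first readable bundle, or an empty string
/// when none of the candidates can be read.
fn read_first_bundle<P: AsRef<Path>>(candidates: &[P]) -> String {
    candidates
        .iter()
        .find_map(|p| fs::read_to_string(p).ok())
        .unwrap_or_default()
}

/// Hypothetical consumer standing in for `write_ca_files`; it only reports
/// how many bytes it would write.
fn write_ca_files_demo(bundle: &str) -> usize {
    bundle.len()
}

/// Hypothetical consumer standing in for `build_upstream_client_config`;
/// it only counts certificate headers.
fn build_client_config_demo(bundle: &str) -> usize {
    bundle.matches("-----BEGIN CERTIFICATE-----").count()
}

/// Reads the bundle once, before any child process exists, and hands the
/// same string to both consumers.
fn startup_demo<P: AsRef<Path>>(candidates: &[P]) -> (usize, usize) {
    let bundle = read_first_bundle(candidates);
    (write_ca_files_demo(&bundle), build_client_config_demo(&bundle))
}
```

A supervisor following this pattern would call `startup_demo(CANDIDATE_BUNDLES)` once at startup. Because both consumers borrow the same string, they cannot see different trust stores even if the file changes later.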

Checklist

The proxy's upstream TLS client only trusted Mozilla root CAs
(webpki-roots), which prevented TLS termination from working with
internal/corporate hosts using private CA certificates.

Load system CA certificates from the container's trust store
(e.g. /etc/ssl/certs/ca-certificates.crt) in addition to
webpki-roots. This allows custom sandbox images to include
corporate CAs via update-ca-certificates.

Signed-off-by: Matthias Osswald <mat.osswald@sap.com>
@matz3 matz3 requested a review from a team as a code owner April 16, 2026 07:33
@copy-pr-bot

copy-pr-bot Bot commented Apr 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions

github-actions Bot commented Apr 16, 2026

All contributors have signed the DCO ✍️ ✅
Posted by the DCO Assistant Lite bot.

@matz3
Contributor Author

matz3 commented Apr 16, 2026

I have read the DCO document and I hereby sign the DCO.

@johntmyers
Collaborator

recheck

@johntmyers
Collaborator

Hi @matz3 thank you. Is this good to merge?

@matz3
Contributor Author

matz3 commented Apr 16, 2026

Yes 👍🏻
It works fine for me when using a custom sandbox image as mentioned above.

@johntmyers johntmyers merged commit 3b21df1 into NVIDIA:main Apr 16, 2026
10 of 11 checks passed
@matz3 matz3 deleted the sandbox/load-system-ca-certs branch April 17, 2026 06:30
ericksoa pushed a commit to NVIDIA/NemoClaw that referenced this pull request Apr 23, 2026
## Summary
Bumps the pinned OpenShell version range from `0.0.29` → `0.0.32` so
fresh NemoClaw installs pick up sandbox hardening and TLS improvements
from the last three OpenShell releases.

## Notable upstream changes

**0.0.30**
([NVIDIA/OpenShell@v0.0.29...v0.0.30](NVIDIA/OpenShell@v0.0.29...v0.0.30))
- Network policy deny rules
([OpenShell#822](NVIDIA/OpenShell#822))
- Preserve ownership on existing `read_write` paths
([OpenShell#827](NVIDIA/OpenShell#827))
- Disable child core dumps
([OpenShell#821](NVIDIA/OpenShell#821))
- Escape control characters in SSE error formatting
([OpenShell#842](NVIDIA/OpenShell#842))
- Fix silent truncation of large streaming inference responses
([OpenShell#834](NVIDIA/OpenShell#834))

**0.0.31**
([NVIDIA/OpenShell@v0.0.30...v0.0.31](NVIDIA/OpenShell@v0.0.30...v0.0.31))
- Inference routed-request header allowlist
([OpenShell#826](NVIDIA/OpenShell#826))

**0.0.32**
([NVIDIA/OpenShell@v0.0.31...v0.0.32](NVIDIA/OpenShell@v0.0.31...v0.0.32))
- **Load system CA certificates for upstream TLS connections**
([OpenShell#862](NVIDIA/OpenShell#862))
- Publish standalone `openshell-gateway` binaries
([OpenShell#853](NVIDIA/OpenShell#853))

## Changes
- `nemoclaw-blueprint/blueprint.yaml`: `min_openshell_version` and
`max_openshell_version` → `0.0.32`
- `scripts/install-openshell.sh`: `MIN_VERSION` and `MAX_VERSION` →
`0.0.32` (`PIN_VERSION` follows `MAX`)
- `scripts/brev-launchable-ci-cpu.sh`: default `OPENSHELL_VERSION` →
`v0.0.32`
- `src/lib/onboard.ts`: blueprint-fallback min version → `0.0.32`
- `test/onboard.test.ts`,
`test/install-openshell-version-check.test.ts`: fixtures updated; "above
MAX" test case moved from `0.0.30` to `0.0.33`

Historical `m-dev` comments referencing `0.0.29` left in place — they
describe a self-report quirk the sidecar fallback still handles.

## Why not 0.0.33+?
`0.0.34` introduced incremental sandbox policy updates and L7
request-target canonicalization — changes with larger surface area
against how NemoClaw delivers policy via gRPC. Worth a follow-up PR
rather than bundling here. `0.0.35` released hours before this PR was
cut — too fresh.

## Type of Change
- [x] Code change for a new feature, bug fix, or refactor.

## Testing
- [x] `npx vitest run test/install-openshell-version-check.test.ts` — 9
passed
- [x] pre-commit hooks (prek) clean: shellcheck, commitlint, gitleaks,
YAML validator, CLI test suite
- [ ] Nightly E2E on this branch — will be kicked off after PR opens

## Notes
- No user-facing CLI behavior changes — just the pinned version range.
- Two pre-existing failures in `test/onboard.test.ts` reproduce on clean
`main` and are unrelated to this bump.

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Updated OpenShell version constraints and default pinned version to
v0.0.32 across configuration, install, and onboarding flows.

* **Tests**
* Updated test fixtures and expectations to match the new OpenShell
version (v0.0.32).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Prekshi Vyas <prekshiv@nvidia.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
