fix(ci): use nv-gha-runners buildkit mirror to avoid Docker Hub rate limit #966

Merged
pimlock merged 1 commit into main from jtoelke/os-127-buildkit-mirror-config on Apr 24, 2026

@jtoelke2
Collaborator

Summary

First dispatch of shadow-docker-build (run 24908165293) hit Docker Hub's unauthenticated pull rate limit on 3/6 jobs — classic "lucky pulls go through, unlucky ones hit the cap" pattern on shared-runner egress.

Per pimlock's pointer (nv-gha-runners best-practices, "Use Docker Cache for BuildKit"), the fix is one input to `docker/setup-buildx-action@v3`: `buildkitd-config: /etc/buildkit/buildkitd.toml`. That TOML file is pre-populated on every nv-gha-runner and points Docker Hub traffic at an in-environment mirror.

Related Issue

OS-49 runner migration, Phase 3 / OS-127. Unblocks clean shadow dispatches on PR 964's shadow workflow.

Changes

  • `.github/actions/setup-buildx/action.yml`: add `buildkitd-config: /etc/buildkit/buildkitd.toml` to the `driver: local` branch only. The remote-driver path is unaffected (ARC BuildKit pods live inside EKS and don't egress to Docker Hub).
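
For readers who have not opened the diff, the change could look roughly like the step below. This is a sketch, not the merged code: the step name and the `if:` condition on a composite-action `driver` input are assumptions. Only the action reference and the `buildkitd-config` value come from the description above.

```yaml
# Hypothetical excerpt of .github/actions/setup-buildx/action.yml
# (local-driver branch only; step name and condition are illustrative).
- name: Set up buildx (local driver)
  if: inputs.driver == 'local'
  uses: docker/setup-buildx-action@v3
  with:
    # File pre-populated on nv-gha-runners; it points Docker Hub
    # pulls at the in-environment mirror.
    buildkitd-config: /etc/buildkit/buildkitd.toml
```

The path is resolved wherever the step runs, so the file has to be visible from that environment.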

Testing

  • `mise run pre-commit`: Rust / Python / license / helm all green. `markdown:lint:md` fails on 8 pre-existing MD040 errors in architecture/podman-rootless-networking.md from PR #904 (Openshell driver podman), unrelated to this PR.
  • Unit tests added/updated — N/A; composite action config.
  • E2E tests — N/A.
  • Dispatch validation — will re-dispatch shadow-docker-build after merge; expect all 6 matrix cells to succeed on image pulls.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated — N/A.

…limit

Signed-off-by: Jonas Toelke <jtoelke@nvidia.com>
@jtoelke2 jtoelke2 requested a review from a team as a code owner April 24, 2026 20:49
@jtoelke2 jtoelke2 self-assigned this Apr 24, 2026
@jtoelke2 jtoelke2 requested a review from pimlock April 24, 2026 20:49
@pimlock pimlock merged commit daa7d7d into main Apr 24, 2026
21 of 22 checks passed
@pimlock pimlock deleted the jtoelke/os-127-buildkit-mirror-config branch April 24, 2026 20:54
jtoelke2 added a commit that referenced this pull request Apr 24, 2026
#966 hard-coded `buildkitd-config: /etc/buildkit/buildkitd.toml` inside
the `driver: local` branch of the setup-buildx composite action. The only
caller using that driver is shadow-docker-build.yml, which runs inside
the ghcr.io/nvidia/openshell/ci:latest container — so the host-side
buildkitd.toml was invisible to docker/setup-buildx-action and every
matrix job failed at "Set up buildx".

Revert the hard-coded path and expose it as an opt-in input on the
action (empty default, passed through to both the remote and local
branches). Wire shadow-docker-build.yml to bind-mount /etc/buildkit
into the ci container and pass the path explicitly, so the action can
read the file from inside the container. Remote-driver callers are
unaffected (empty input is a no-op).

Signed-off-by: Jonas Toelke <jtoelke@nvidia.com>
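
The follow-up described in that commit message might be wired up along these lines. This is a minimal sketch under stated assumptions: the input name `buildkitd-config` on the composite action, the job name, the runner label, and the step layout are all hypothetical. The container image, the `/etc/buildkit` bind mount, and the explicit path come from the commit message.

```yaml
# --- .github/actions/setup-buildx/action.yml (sketch) ---
# Opt-in input with an empty default, so existing callers are unaffected.
inputs:
  buildkitd-config:   # input name is an assumption
    description: Optional path to a buildkitd.toml readable from where the step runs
    required: false
    default: ""
---
# --- .github/workflows/shadow-docker-build.yml (sketch) ---
jobs:
  build:                      # hypothetical job name
    runs-on: self-hosted      # placeholder runner label
    container:
      image: ghcr.io/nvidia/openshell/ci:latest
      volumes:
        # Bind-mount the host directory so the file is visible inside the container.
        - /etc/buildkit:/etc/buildkit
    steps:
      - uses: actions/checkout@v4
      - uses: ./.github/actions/setup-buildx
        with:
          driver: local
          buildkitd-config: /etc/buildkit/buildkitd.toml
```

Passing the path from the workflow rather than hard-coding it in the action keeps the mount and the path that depends on it in the same file.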