ci: add npm-token-health workflow #17

Merged: askalf merged 1 commit into main from ci/npm-token-health on May 22, 2026
@askalf (Owner) commented May 22, 2026

Summary

Daily npm whoami check that catches NPM_TOKEN rot before the next release silently fails to publish.

Motivating incident (today, 2026-05-22)

This repo's NPM_TOKEN was dated 2026-04-01. The @askalf/dario token was rotated 2026-05-20 but the rotation wasn't propagated here. The 2026-05-12 release of v3.2.1 (containing the WS-subprotocol crash fix) failed silently at npm publish with HTTP 404 from PUT /@askalf%2Fagent. Anyone running npm i -g @askalf/agent for the next 10 days got stale v3.1.5 — still affected by the very crash this release was fixing — and there was no signal until a manual fleet audit caught it.

What the workflow does

  • Runs npm whoami --registry=https://registry.npmjs.org daily at 04:17 UTC against the token in secrets.NPM_TOKEN.
  • On failure: opens a single GitHub issue (de-duped via <!-- agent-npm-token-rot --> marker comment so a stale token doesn't open a fresh issue every day). Issue body has the full rotate-and-rerun recipe.
  • On success: closes any open token-rot issue automatically (re-arms after rotation).
  • workflow_dispatch for manual verification right after a rotation.
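
The open/close behaviour in the list above can be sketched as a small shell step. This is a minimal sketch under stated assumptions, not the workflow's actual script: the function names (decide_action, find_open_issue, apply_action), the issue title, the closing comment, and the ISSUE_BODY_FILE variable are hypothetical. Only the marker comment, the npm-token-rot label, and the npm whoami check come from this PR.

```shell
#!/usr/bin/env bash
# Sketch of the de-dupe logic; helper names are hypothetical, not from the workflow.
MARKER='<!-- agent-npm-token-rot -->'

# Decide what to do from the whoami result and any existing open issue.
#   $1: exit status of `npm whoami` (0 = token authenticates)
#   $2: number of the open token-rot issue, or empty if there is none
decide_action() {
  local whoami_status="$1" open_issue="$2"
  if [ "$whoami_status" -ne 0 ]; then
    # Token is broken: open an issue only if one is not already open.
    if [ -z "$open_issue" ]; then echo "open"; else echo "noop"; fi
  else
    # Token is healthy: close a leftover issue so the alarm re-arms.
    if [ -n "$open_issue" ]; then echo "close"; else echo "noop"; fi
  fi
}

# Print the number of the first open issue whose body carries the marker,
# or nothing if there is none. Assumes gh is authenticated for the repo.
find_open_issue() {
  gh issue list --state open --label npm-token-rot --json number,body \
    --jq "[.[] | select(.body | contains(\"$MARKER\"))][0].number // empty"
}

# Carry out the decision. ISSUE_BODY_FILE (hypothetical) must contain $MARKER
# so that later runs can find the issue again.
apply_action() {
  local action="$1" open_issue="$2"
  case "$action" in
    open)
      gh issue create --title "NPM_TOKEN no longer authenticates" \
        --label npm-token-rot --body-file "$ISSUE_BODY_FILE" ;;
    close)
      gh issue close "$open_issue" --comment "npm whoami passes again; closing." ;;
  esac
}

# Possible wiring inside a workflow step (left commented out in this sketch):
#   npm whoami --registry=https://registry.npmjs.org; status=$?
#   issue="$(find_open_issue)"
#   apply_action "$(decide_action "$status" "$issue")" "$issue"
```

Keeping the decision in a pure function separates the "one issue per incident" rule from the gh calls, so the rule can be checked without touching the registry or the GitHub API.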

Pattern source

Mirrors @askalf/dario's npm-token-health.yml — proven in production. Only differences: agent- marker prefix instead of dario-, and the issue-body recovery recipe references gh run rerun (agent's publish workflow only fires on release: published) instead of dario's dispatch-able fallback.

Test plan

  • Merge
  • Create npm-token-rot label on the repo (referenced by --label in the issue-create step)
  • gh workflow run npm-token-health.yml -R askalf/agent → should pass (token is healthy, just minted today)
  • Optional negative test: temporarily break the secret, dispatch, confirm an issue gets opened with the right marker, restore the secret, dispatch again, confirm the issue auto-closes
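
The post-merge steps above could be run from a terminal roughly as follows. This is a sketch only: the run_step helper, the DRY_RUN variable, the post_merge_check function, and the label description text are hypothetical conveniences. The label name, workflow file, and repo come from this PR.

```shell
#!/usr/bin/env bash
REPO="askalf/agent"
WORKFLOW="npm-token-health.yml"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

post_merge_check() {
  # The issue-create step passes --label, so the label has to exist first.
  run_step gh label create npm-token-rot -R "$REPO" \
    --description "NPM_TOKEN failed npm whoami"
  # Dispatch the health check manually; it should pass with a fresh token.
  run_step gh workflow run "$WORKFLOW" -R "$REPO"
  # Show the most recent run of the workflow to read off its result.
  run_step gh run list --workflow "$WORKFLOW" -R "$REPO" --limit 1
}
```

Calling `DRY_RUN=1 post_merge_check` prints the three gh commands without executing them, which is a cheap way to review them before running against the real repo.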

Daily npm whoami check against the configured registry; opens (and
de-dupes via marker comment) a GitHub issue if NPM_TOKEN no longer
authenticates, and auto-closes that issue on the first passing run
after rotation.

Motivated by today's discovery that NPM_TOKEN had been rotated 7
weeks ago (2026-04-01 → present) without updating this repo's
secret, which silently stranded v3.1.6 / v3.2.0 / v3.2.1 from
reaching npm. Anyone running 'npm i -g @askalf/agent' got the stale
v3.1.5, which lacks the WS-subprotocol crash fix that the v3.2.1
release was specifically shipping.

Pattern mirrors the existing @askalf/dario npm-token-health.yml;
issue body includes the gh-secret-set + gh-run-rerun recipe so the
recovery is self-documenting for any future incident.
@askalf merged commit 2a24f30 into main on May 22, 2026
3 checks passed
@askalf deleted the ci/npm-token-health branch on May 22, 2026 16:11