feat: upstream drift detector#58
Conversation
Phase 1 of the upstream-drift tracker (see #56 for the policy this implements). Reads upstream/.upstream-sync.json, calls the GitHub compare API for affaan-m/everything-claude-code, and logs whether upstream is ahead of the recorded baseline. Phase 2 will add issue management on a weekly cron; for now the workflow only runs on manual dispatch. What this lands: - scripts/lib/upstream-drift.js — pure functions, JSDoc-typed: parseUpstreamState (strict schema validation) shortSha formatIssueTitle (used in Phase 2; tested now) formatIssueBody (with COMMIT_BODY_LIMIT truncation; tested now) compareUrl - scripts/upstream/check-upstream-drift.js — entry script. Shells out to gh api compare. Tolerates 404 (upstream rename/archive/ deletion) with a warning and exit 0 so weekly cron does not produce red runs that nobody triages. maxBuffer is bumped to 50 MB because compare responses easily exceed Node's default 1 MB when the delta is large. - tests/lib/upstream-drift.test.js — 20 unit tests covering schema validation (good, malformed JSON, bad upstream, bad SHA, bad date, bad notes), pluralization, truncation, URL construction. - tests/run-all.js — registers the new test file. - .github/workflows/upstream-drift.yml — workflow_dispatch only, permissions: contents: read, no issues:write yet (Phase 2 will add it together with the cron trigger and label). Local smoke run against live upstream confirmed: ECC is currently 1517 commits ahead of the 2026-02-09 baseline; script prints the delta and the compare URL cleanly. Lint + tests both green (242/242). Workflow will be scanned by the existing validate-workflow-security.js — uses neither workflow_run nor pull_request_target, so it passes trivially.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds upstream drift detection infrastructure: a pure-functional helper library for state validation and formatting, a CLI script using GitHub API to compare baseline with upstream HEAD (phase-1 log-only), a manual GitHub Actions workflow to execute the check, comprehensive tests validating all helpers, and test runner configuration. ChangesUpstream Drift Detection
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/upstream/check-upstream-drift.js`:
- Around line 19-21: The requires for built-in modules in
scripts/upstream/check-upstream-drift.js use bare names; update them to use
Node.js "node:" specifiers by replacing require('fs') with require('node:fs'),
require('path') with require('node:path'), and require('child_process') with
require('node:child_process') so the symbols fs, path and execSync continue to
work but follow the repo rule for Node 20+ built-ins.
- Around line 56-63: Replace the use of execSync with execFileSync to avoid
shell interpolation: in the block where stdout is assigned (currently calling
execSync with a template string `gh api ${endpoint}`), call execFileSync('gh',
['api', endpoint], ...) instead and preserve the same options (encoding, stdio,
maxBuffer) so behavior and buffering remain unchanged, ensuring the endpoint is
passed as an argv element rather than embedded in a shell string.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a0ce1939-cd02-447f-baae-338a9ff8fbf0
📒 Files selected for processing (5)
.github/workflows/upstream-drift.ymlscripts/lib/upstream-drift.jsscripts/upstream/check-upstream-drift.jstests/lib/upstream-drift.test.jstests/run-all.js
There was a problem hiding this comment.
2 issues found across 5 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/upstream/check-upstream-drift.js">
<violation number="1" location="scripts/upstream/check-upstream-drift.js:56">
P2: `execSync` with a template string spawns a shell, creating a command-injection surface even though the inputs are regex-validated today. Switch to `execFileSync('gh', ['api', endpoint], …)` which bypasses shell parsing entirely and is the standard defense-in-depth approach.</violation>
<violation number="2" location="scripts/upstream/check-upstream-drift.js:66">
P2: Do not treat every compare-API 404 as a tolerated upstream-unreachable case; it can mask invalid baseline refs and silently skip drift detection.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
P2 — execSync command-injection surface (CodeRabbit, cubic):
Replaced execSync with a template string by execFileSync('gh',
['api', endpoint], opts). The endpoint argument now bypasses shell
parsing entirely. While the input is regex-validated upstream, the
defense-in-depth change costs nothing and removes a class of risk
from the audit-tool surface.
P2 — 404 disambiguation (cubic):
The previous code treated every 404 from `gh api` as a tolerable
"upstream unreachable" warning. That silently masks the much more
common failure mode of a malformed `lastSyncedSha` in
upstream/.upstream-sync.json — the compare endpoint also returns
404 when the baseline ref doesn't exist in a repo that does.
Split into two checks:
1. Pre-flight `gh api repos/<upstream>` — 404 here is genuine
rename/archive/deletion. Warn, exit 0, weekly cron stays green.
2. After pre-flight succeeds, compare-endpoint 404 is a hard
error. Print a specific "Check upstream/.upstream-sync.json"
message and exit 1.
ghApi() now surfaces 404 as a typed error (`err.is404 === true`)
so callers can act on it. Added isUpstreamReachable() as the
pre-flight helper.
P2 — node: prefix suggestion (CodeRabbit):
Not applying — repo convention is 23 plain `require('fs')` vs. 1
`require('node:fs')` (see scripts/hooks/run.js). Switching only
the new files would create inconsistency; a repo-wide refactor is
out of scope for this PR. Will reply on the thread.
Local smoke run:
- Happy path: ECC 1518 commits ahead, clean output, exit 0.
- Bad-baseline path: forced lastSyncedSha to 'deadbeef...' x 5,
pre-flight passes, compare returns 404, script prints
"Baseline SHA ... is not reachable" and exits 1.
Lint clean. 242/242 tests pass.
Summary
Phase 1 of the upstream-drift tracker described in
upstream/README.md. Readsupstream/.upstream-sync.json, calls the GitHub compare API foraffaan-m/everything-claude-code, and logs whether upstream is ahead of the recorded baseline.Phase 2 will add issue management (rolling tracking issue) and the weekly cron. For now the workflow is
workflow_dispatchonly — manual smoke first, automation once the issue path is fully tested.What's in this PR
scripts/lib/upstream-drift.jsparseUpstreamState,shortSha,formatIssueTitle,formatIssueBody,compareUrl. Tested in isolation.scripts/upstream/check-upstream-drift.jsgh api compare, tolerates 404 (rename/archive/deletion) withwarn+ exit 0 so cron doesn't produce red runs nobody triages..github/workflows/upstream-drift.ymlworkflow_dispatchonly.permissions: contents: read. Noissues: writeyet — that lands in Phase 2 alongside the cron + label.tests/lib/upstream-drift.test.jsCOMMIT_BODY_LIMIT, URL construction.tests/run-all.jsImplementation notes
gh apinot new npm deps. TheghCLI is preinstalled on GitHub-hosted runners and authenticated viaGITHUB_TOKEN. No PAT needed becauseaffaan-m/everything-claude-codeis public.maxBufferbumped to 50 MB. The compare endpoint returns >1 MB of JSON when the delta is large. DefaultexecSyncbuffer is 1 MB; overflow killed the child withSIGTERM. Local smoke confirms the bumped buffer handles a 1517-commit delta cleanly.parseUpstreamStatethrows on any structural problem (bad SHA format, missing field, wrong type) so the workflow fails loudly rather than silently treating a malformed state file as "no drift."Local smoke run
Out of scope
upstream-synctracking issue.upstream-synclabel in the existing label manifest.upstream-sync-failureissue if the workflow itself errors).upstream/README.mdmatchesupstream/.upstream-sync.json.Test plan
npm run lint— clean (incl. newpreserve-caught-errorrule)npm test— 242/242 (20 new inlib/upstream-drift.test.js)node scripts/upstream/check-upstream-drift.js) — output abovevalidate-workflow-security.js(noworkflow_run, nopull_request_target)workflow_dispatchrun from the Actions tab post-merge to confirm the YAML parses andGITHUB_TOKENis sufficient for the API calls