Skip to content

feat: upstream drift detector#58

Merged
Jamkris merged 2 commits into
mainfrom
feat/upstream-drift-tracker
May 12, 2026
Merged

feat: upstream drift detector#58
Jamkris merged 2 commits into
mainfrom
feat/upstream-drift-tracker

Conversation

@Jamkris
Copy link
Copy Markdown
Owner

@Jamkris Jamkris commented May 12, 2026

Summary

Phase 1 of the upstream-drift tracker described in upstream/README.md. Reads upstream/.upstream-sync.json, calls the GitHub compare API for affaan-m/everything-claude-code, and logs whether upstream is ahead of the recorded baseline.

Phase 2 will add issue management (rolling tracking issue) and the weekly cron. For now the workflow is workflow_dispatch only — manual smoke first, automation once the issue path is fully tested.

What's in this PR

File Role
scripts/lib/upstream-drift.js Pure functions (JSDoc-typed): parseUpstreamState, shortSha, formatIssueTitle, formatIssueBody, compareUrl. Tested in isolation.
scripts/upstream/check-upstream-drift.js Entry script. Shells out to gh api compare, tolerates 404 (rename/archive/deletion) with warn + exit 0 so cron doesn't produce red runs nobody triages.
.github/workflows/upstream-drift.yml workflow_dispatch only. permissions: contents: read. No issues: write yet — that lands in Phase 2 alongside the cron + label.
tests/lib/upstream-drift.test.js 20 unit tests: schema validation (malformed JSON, bad upstream, bad SHA, empty date, non-string notes), short-SHA, title pluralization, body truncation at COMMIT_BODY_LIMIT, URL construction.
tests/run-all.js Registers the new test file.

Implementation notes

  • gh api not new npm deps. The gh CLI is preinstalled on GitHub-hosted runners and authenticated via GITHUB_TOKEN. No PAT needed because affaan-m/everything-claude-code is public.
  • maxBuffer bumped to 50 MB. The compare endpoint returns >1 MB of JSON when the delta is large. Default execSync buffer is 1 MB; overflow killed the child with SIGTERM. Local smoke confirms the bumped buffer handles a 1517-commit delta cleanly.
  • Strict state validation. parseUpstreamState throws on any structural problem (bad SHA format, missing field, wrong type) so the workflow fails loudly rather than silently treating a malformed state file as "no drift."
  • 404 is tolerated. If upstream is renamed, archived, or deleted, the script logs the failure and exits 0. The weekly Action shouldn't go red every Monday until the config is fixed.

Local smoke run

[upstream-drift] Upstream: affaan-m/everything-claude-code
[upstream-drift] Status: ahead (ahead_by=1517, behind_by=0)
[upstream-drift] Baseline: 9db9867 (recorded 2026-02-09)
[upstream-drift] Upstream HEAD: e5229ce
[upstream-drift] Compare: https://github.com/affaan-m/everything-claude-code/compare/9db98673d054f5ed0991ba9d67ff4c883c81a42f...e5229cec92f52fc42417a7e9b397ba04c2b2ab1c
[upstream-drift] Drift detected: upstream is 1517 commit(s) ahead.

Out of scope

  • Issue management: list/create/edit/close the rolling upstream-sync tracking issue.
  • Weekly cron trigger (Monday 21:00 UTC = Monday 06:00 KST).
  • upstream-sync label in the existing label manifest.
  • Action-failure tracking (the recursive trick: a separate upstream-sync-failure issue if the workflow itself errors).
  • CI validator that asserts the SHA in upstream/README.md matches upstream/.upstream-sync.json.

Test plan

  • npm run lint — clean (incl. new preserve-caught-error rule)
  • npm test — 242/242 (20 new in lib/upstream-drift.test.js)
  • Local smoke against live upstream (node scripts/upstream/check-upstream-drift.js) — output above
  • Workflow YAML passes existing validate-workflow-security.js (no workflow_run, no pull_request_target)
  • Manual workflow_dispatch run from the Actions tab post-merge to confirm the YAML parses and GITHUB_TOKEN is sufficient for the API calls

Phase 1 of the upstream-drift tracker (see #56 for the policy this
implements). Reads upstream/.upstream-sync.json, calls the GitHub
compare API for affaan-m/everything-claude-code, and logs whether
upstream is ahead of the recorded baseline. Phase 2 will add issue
management on a weekly cron; for now the workflow only runs on
manual dispatch.

What this lands:

- scripts/lib/upstream-drift.js — pure functions, JSDoc-typed:
    parseUpstreamState (strict schema validation)
    shortSha
    formatIssueTitle (used in Phase 2; tested now)
    formatIssueBody (with COMMIT_BODY_LIMIT truncation; tested now)
    compareUrl
- scripts/upstream/check-upstream-drift.js — entry script. Shells
  out to gh api compare. Tolerates 404 (upstream rename/archive/
  deletion) with a warning and exit 0 so weekly cron does not
  produce red runs that nobody triages. maxBuffer is bumped to
  50 MB because compare responses easily exceed Node's default 1 MB
  when the delta is large.
- tests/lib/upstream-drift.test.js — 20 unit tests covering schema
  validation (good, malformed JSON, bad upstream, bad SHA, bad date,
  bad notes), pluralization, truncation, URL construction.
- tests/run-all.js — registers the new test file.
- .github/workflows/upstream-drift.yml — workflow_dispatch only,
  permissions: contents: read, no issues:write yet (Phase 2 will
  add it together with the cron trigger and label).

Local smoke run against live upstream confirmed: ECC is currently
1517 commits ahead of the 2026-02-09 baseline; script prints the
delta and the compare URL cleanly.

Lint + tests both green (242/242). Workflow will be scanned by the
existing validate-workflow-security.js — uses neither workflow_run
nor pull_request_target, so it passes trivially.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 12, 2026

Warning

Rate limit exceeded

@Jamkris has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 19 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a2c2ec5b-4ce6-4c48-b364-590a7b0bb016

📥 Commits

Reviewing files that changed from the base of the PR and between 1b637a5 and cf359a5.

📒 Files selected for processing (1)
  • scripts/upstream/check-upstream-drift.js

Walkthrough

Adds upstream drift detection infrastructure: a pure-functional helper library for state validation and formatting, a CLI script using GitHub API to compare baseline with upstream HEAD (phase-1 log-only), a manual GitHub Actions workflow to execute the check, comprehensive tests validating all helpers, and test runner configuration.

Changes

Upstream Drift Detection

Layer / File(s) Summary
Helper Library and State Validation
scripts/lib/upstream-drift.js
Regex validators for commit SHAs and owner/repo identifiers; parseUpstreamState validates JSON structure (upstream, lastSyncedSha, lastSyncedAt, optional notes); shortSha truncates to 7 chars; formatIssueTitle generates titles with commit count; formatIssueBody renders Markdown with baseline/upstream SHAs, compare URL, and commits truncated to COMMIT_BODY_LIMIT; compareUrl builds GitHub compare links.
CLI Entry Script and GitHub API Integration
scripts/upstream/check-upstream-drift.js
Loads state from upstream/.upstream-sync.json, calls gh api to compare baseline with upstream HEAD, logs drift summary (status, baseline SHA/time, upstream HEAD SHA, compare URL). Tolerates 404/Not Found upstream by logging warning and exiting 0; exits 1 only on local state read failures or unexpected API errors. Exports readState, ghApi, summarize for testing.
GitHub Actions Workflow Configuration
.github/workflows/upstream-drift.yml
Manual-dispatch workflow on ubuntu-latest with 5-minute timeout; checks out repo, sets up Node.js 20.x, runs drift check script with GH_TOKEN and contents: read permission.
Comprehensive Test Suite
tests/lib/upstream-drift.test.js
Validates parseUpstreamState error handling (malformed JSON, invalid fields, incorrect SHA length/hex, empty dates); tests shortSha, formatIssueTitle (pluralization, input validation), formatIssueBody (URL inclusion, commit truncation beyond COMMIT_BODY_LIMIT, empty commits, invalid inputs), and compareUrl construction. Simple runner exits with code 1 on failures.
Test Runner Configuration
tests/run-all.js
Added lib/upstream-drift.test.js to testFiles array for discovery.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

security

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 54.55% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: upstream drift detector' clearly describes the main change: adding an upstream drift detection feature with the necessary workflow, library, and test files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/upstream-drift-tracker

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Jamkris Jamkris changed the title feat: upstream drift detector (Phase 1 — log only) feat: upstream drift detector May 12, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/upstream/check-upstream-drift.js`:
- Around line 19-21: The requires for built-in modules in
scripts/upstream/check-upstream-drift.js use bare names; update them to use
Node.js "node:" specifiers by replacing require('fs') with require('node:fs'),
require('path') with require('node:path'), and require('child_process') with
require('node:child_process') so the symbols fs, path and execSync continue to
work but follow the repo rule for Node 20+ built-ins.
- Around line 56-63: Replace the use of execSync with execFileSync to avoid
shell interpolation: in the block where stdout is assigned (currently calling
execSync with a template string `gh api ${endpoint}`), call execFileSync('gh',
['api', endpoint], ...) instead and preserve the same options (encoding, stdio,
maxBuffer) so behavior and buffering remain unchanged, ensuring the endpoint is
passed as an argv element rather than embedded in a shell string.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a0ce1939-cd02-447f-baae-338a9ff8fbf0

📥 Commits

Reviewing files that changed from the base of the PR and between 1c25fb9 and 1b637a5.

📒 Files selected for processing (5)
  • .github/workflows/upstream-drift.yml
  • scripts/lib/upstream-drift.js
  • scripts/upstream/check-upstream-drift.js
  • tests/lib/upstream-drift.test.js
  • tests/run-all.js

Comment thread scripts/upstream/check-upstream-drift.js Outdated
Comment thread scripts/upstream/check-upstream-drift.js Outdated
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 5 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="scripts/upstream/check-upstream-drift.js">

<violation number="1" location="scripts/upstream/check-upstream-drift.js:56">
P2: `execSync` with a template string spawns a shell, creating a command-injection surface even though the inputs are regex-validated today. Switch to `execFileSync('gh', ['api', endpoint], …)` which bypasses shell parsing entirely and is the standard defense-in-depth approach.</violation>

<violation number="2" location="scripts/upstream/check-upstream-drift.js:66">
P2: Do not treat every compare-API 404 as a tolerated upstream-unreachable case; it can mask invalid baseline refs and silently skip drift detection.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread scripts/upstream/check-upstream-drift.js Outdated
Comment thread scripts/upstream/check-upstream-drift.js Outdated
P2 — execSync command-injection surface (CodeRabbit, cubic):
  Replaced execSync with a template string by execFileSync('gh',
  ['api', endpoint], opts). The endpoint argument now bypasses shell
  parsing entirely. While the input is regex-validated upstream, the
  defense-in-depth change costs nothing and removes a class of risk
  from the audit-tool surface.

P2 — 404 disambiguation (cubic):
  The previous code treated every 404 from `gh api` as a tolerable
  "upstream unreachable" warning. That silently masks the much more
  common failure mode of a malformed `lastSyncedSha` in
  upstream/.upstream-sync.json — the compare endpoint also returns
  404 when the baseline ref doesn't exist in a repo that does.

  Split into two checks:
  1. Pre-flight `gh api repos/<upstream>` — 404 here is genuine
     rename/archive/deletion. Warn, exit 0, weekly cron stays green.
  2. After pre-flight succeeds, compare-endpoint 404 is a hard
     error. Print a specific "Check upstream/.upstream-sync.json"
     message and exit 1.

  ghApi() now surfaces 404 as a typed error (`err.is404 === true`)
  so callers can act on it. Added isUpstreamReachable() as the
  pre-flight helper.

P2 — node: prefix suggestion (CodeRabbit):
  Not applying — repo convention is 23 plain `require('fs')` vs. 1
  `require('node:fs')` (see scripts/hooks/run.js). Switching only
  the new files would create inconsistency; a repo-wide refactor is
  out of scope for this PR. Will reply on the thread.

Local smoke run:
- Happy path: ECC 1518 commits ahead, clean output, exit 0.
- Bad-baseline path: forced lastSyncedSha to 'deadbeef...' x 5,
  pre-flight passes, compare returns 404, script prints
  "Baseline SHA ... is not reachable" and exits 1.

Lint clean. 242/242 tests pass.
@Jamkris Jamkris merged commit 724341e into main May 12, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant