
fix: rotate proof nudge scans#377

Open
brokemac79 wants to merge 4 commits into
openclaw:mainfrom
brokemac79:codex/proof-nudge-rotation

Conversation

@brokemac79

@brokemac79 brokemac79 commented Jun 27, 2026

Contributor

What Problem This Solves

The scheduled proof-nudge lane can spend its whole processed-record budget on the same skip-heavy prefix of report records. When the first candidate window is mostly protected, recently reviewed, already nudged, or otherwise skipped, later stale PRs that still need real behavior proof can wait even though the workflow is running successfully every day.

Why This Change Was Made

This adds durable cursor rotation to the untargeted proof-nudges and bot-proof scans. Each executing untargeted run starts after the previous cursor, processes a bounded candidate window, and advances the cursor to the last processed candidate. Targeted --item-numbers runs stay deterministic and do not read or update cursor state.
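The rotation described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/clawsweeper.ts: the names `Candidate`, `CursorState`, `rotateAfterCursor`, and `scanWindow` are hypothetical, and the fallback to the front of the list when the saved cursor item no longer exists is an assumption.

```typescript
// Hypothetical shapes for illustration only; the real helpers may differ.
interface Candidate {
  number: number;
}

interface CursorState {
  lane: string;
  nextCursorNumber: number;
}

// Rotate the candidate list so scanning starts just after the saved cursor
// and wraps around to the front of the list.
function rotateAfterCursor<T extends Candidate>(candidates: T[], cursor: CursorState | null): T[] {
  if (cursor === null) return candidates.slice();
  const index = candidates.findIndex((c) => c.number === cursor.nextCursorNumber);
  // Assumption: if the cursor item is gone, restart from the front.
  if (index === -1) return candidates.slice();
  return [...candidates.slice(index + 1), ...candidates.slice(0, index + 1)];
}

// Process a bounded window and advance the cursor to the last processed item.
function scanWindow(
  candidates: Candidate[],
  cursor: CursorState | null,
  processedLimit: number,
  lane: string,
  targeted: boolean,
): { processed: number[]; nextCursor: CursorState | null } {
  // Targeted runs stay deterministic: they neither read nor update the cursor.
  const ordered = targeted ? candidates.slice() : rotateAfterCursor(candidates, cursor);
  const processed: number[] = [];
  let last: number | null = null;
  for (const candidate of ordered.slice(0, processedLimit)) {
    processed.push(candidate.number);
    last = candidate.number;
  }
  const nextCursor = targeted || last === null ? cursor : { lane, nextCursorNumber: last };
  return { processed, nextCursor };
}
```

With candidates 41, 42, 43 and a processed limit of 2, this sketch visits 41 and 42 on the first untargeted run and 43 and 41 on the second, which is the same order shown in the runtime cursor proof under Evidence.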

The workflow now passes proof-specific cursor paths, publishes state back to clawsweeper-state, and exposes processed_limit / CLAWSWEEPER_PROOF_NUDGES_PROCESSED_LIMIT so scan depth can be tuned separately from the max comments/actions per run. Cursor publishing is gated on the corresponding lane running in execute mode and producing the exact target cursor file, so dry-runs do not publish missing cursor paths and one target repo run does not replace another target repo's cursor file.
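The publish gate can be pictured with a small sketch like the one below. The actual gate lives in the workflow YAML; the function names here (`targetSlug`, `cursorPath`, `cursorPublishPaths`) are hypothetical, and the slug derivation (replacing `/` with `-`) is an assumption inferred from the `openclaw-openclaw.json` paths in the Evidence section.

```typescript
// Illustrative only: the real gate is implemented in the workflow, not in this form.
type Lane = "proof-nudge" | "bot-proof";

interface LaneRun {
  lane: Lane;
  executed: boolean; // false for dry-runs
  targeted: boolean; // true for --item-numbers runs
}

// Assumption: "openclaw/openclaw" maps to "openclaw-openclaw".
function targetSlug(targetRepo: string): string {
  return targetRepo.replace("/", "-");
}

function cursorPath(lane: Lane, targetRepo: string): string {
  return `results/${lane}-cursors/${targetSlug(targetRepo)}.json`;
}

// Publish only the exact per-target file for lanes that ran in execute mode,
// were untargeted, and actually produced the cursor file.
function cursorPublishPaths(
  runs: LaneRun[],
  targetRepo: string,
  fileExists: (path: string) => boolean,
): string[] {
  return runs
    .filter((run) => run.executed && !run.targeted)
    .map((run) => cursorPath(run.lane, targetRepo))
    .filter((path) => fileExists(path));
}
```

Because the result is a list of exact file paths rather than a directory, a run for one target repo has no way to overwrite or delete a sibling target's cursor file in this model.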

User Impact

Scheduled proof handling should move through the proof backlog more predictably instead of repeatedly inspecting the same skipped records. The change does not increase the default comment batch size, does not alter proof eligibility rules, and does not make targeted manual runs rotate unexpectedly.

Evidence

Head proofed: eb444762164a928c45d4635d5861e0a0a3e83ce1.

  • Open PR overlap scan on 2026-06-27 found only this PR for proof nudge cursor and CLAWSWEEPER_PROOF_NUDGES_PROCESSED_LIMIT, and no other open PR for proof nudge rotation.
  • Runtime cursor proof on 2026-06-27 used the built CLI entrypoints with GH_BIN mocked only for safe GitHub read responses. Both lanes ran with --execute and wrote cursor files under the same results/... paths used by the workflow:
command: node dist/clawsweeper.js proof-nudges --target-repo openclaw/openclaw --items-dir <temp>/proof-items --limit 10 --processed-limit 2 --report-path <temp>/proof-nudge-run-1-report.json --cursor-path <temp>/results/proof-nudge-cursors/openclaw-openclaw.json --execute
result: 41:skipped_not_open, 42:skipped_not_open
cursor: lane=proof_nudges next_cursor_number=42 next_cursor_likely=True reviewed_at=2026-01-02T00:00:00Z

command: node dist/clawsweeper.js proof-nudges --target-repo openclaw/openclaw --items-dir <temp>/proof-items --limit 10 --processed-limit 2 --report-path <temp>/proof-nudge-run-2-report.json --cursor-path <temp>/results/proof-nudge-cursors/openclaw-openclaw.json --execute
result: 43:skipped_not_open, 41:skipped_not_open
cursor: lane=proof_nudges next_cursor_number=41 next_cursor_likely=True reviewed_at=2026-01-01T00:00:00Z

command: node dist/clawsweeper.js bot-proof --target-repo openclaw/openclaw --items-dir <temp>/bot-items --limit 10 --processed-limit 2 --report-path <temp>/bot-proof-run-1-report.json --cursor-path <temp>/results/bot-proof-cursors/openclaw-openclaw.json --execute
result: 51:skipped_not_open, 52:skipped_not_open
cursor: lane=bot_proof next_cursor_number=52 next_cursor_likely=True reviewed_at=2026-01-02T00:00:00Z

command: node dist/clawsweeper.js bot-proof --target-repo openclaw/openclaw --items-dir <temp>/bot-items --limit 10 --processed-limit 2 --report-path <temp>/bot-proof-run-2-report.json --cursor-path <temp>/results/bot-proof-cursors/openclaw-openclaw.json --execute
result: 53:skipped_not_open, 51:skipped_not_open
cursor: lane=bot_proof next_cursor_number=51 next_cursor_likely=True reviewed_at=2026-01-01T00:00:00Z
  • Workflow repair after ClawSweeper review publishes exact per-target cursor files: results/proof-nudge-cursors/${target_slug}.json and results/bot-proof-cursors/${target_slug}.json, each only when the corresponding lane executed and wrote that file.
  • Current-head test coverage includes direct execute-path regression tests for bot-proof cursor writes and for targeted proof-lane runs ignoring --cursor-path and leaving cursor files absent.
  • Current-head workflow coverage includes a focused assertion that untargeted cursor setup does not populate cursor_publish_args, and that proof-nudge and bot-proof cursor publish paths are exact target files added only behind execute-mode plus file-exists gates.
  • Docs now state that the workflow publishes only exact target cursor files and does not replace another target repo's cursor file.
  • Rollout monitoring item from the ClawSweeper review: after merge, inspect the first executed scheduled proof-lane state commit and verify it updates only the exact per-target cursor JSON file for the executed lane/target, with no directory-level replacement and no sibling target cursor deletion.
  • pnpm install --frozen-lockfile completed and installed the pinned dev toolchain in the fresh worktree.
  • pnpm exec oxfmt --write .github/workflows/proof-nudges.yml test/sweep-workflow.test.ts docs/proof-nudges.md test/proof-nudge-policy.test.ts src/clawsweeper.ts passed.
  • pnpm exec oxfmt --check .github/workflows/proof-nudges.yml test/sweep-workflow.test.ts docs/proof-nudges.md test/proof-nudge-policy.test.ts src/clawsweeper.ts passed on current head.
  • pnpm run build:all passed on current head.
  • pnpm run lint:src passed on current head.
  • pnpm run lint:scripts passed on current head.
  • node --test test/proof-nudge-policy.test.ts passed on current head: 17 tests, 17 pass.
  • node --test --test-name-pattern "proof nudge workflow" test/sweep-workflow.test.ts passed on current head: 2 tests, 2 pass.
  • pnpm run check was run locally after the exact-file publish repair; it failed on unrelated Windows-checkout issues in existing tests, including CRLF-sensitive workflow assertions, Windows file-mode expectations, and existing command/Codex tests.
  • GitHub pnpm check passed on current head: job 83852938494 completed at 2026-06-27T21:41:06Z.
  • GitHub checks observed on current head all passed: pnpm check, CodeQL, Windows launcher, Socket checks, and the later "github activity to openclaw/notify" runs. One earlier notify run was skipped before the later notify runs succeeded.

@clawsweeper

clawsweeper Bot commented Jun 27, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 27, 2026, 5:51 PM ET / 21:51 UTC.

Summary
The branch adds durable cursor rotation for untargeted proof-nudges and bot-proof scans, workflow wiring for processed limits and exact per-target cursor publication, docs, and regression tests.

Reproducibility: yes, source-based rather than live. Current main sorts proof-lane candidates deterministically and stops at processedLimit without cursor state, so repeated untargeted runs can re-inspect the same skip-heavy prefix. The PR body adds live CLI proof for the proposed rotated behavior.

Review metrics: 2 noteworthy metrics.

  • Changed surface: 5 files, +459/-21. The patch spans CLI source, workflow automation, docs, and tests, so both runtime behavior and scheduled publication matter before merge.
  • Proof lanes affected: 2 lanes. Both proof-nudges and bot-proof gain durable cursors and share the new workflow publication path.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Monitor the first executed scheduled proof-lane state commit for exact per-target cursor updates.

Risk before merge

  • [P1] Merging starts durable cursor publication for executed scheduled proof lanes into openclaw/clawsweeper-state; tests and mocked live CLI proof do not fully prove the first production state commit or rollback behavior.

Maintainer options:

  1. Merge With First-Run Cursor Monitoring (recommended)
    Maintainers can merge this exact-file implementation and inspect the first executed scheduled state commit for only the expected per-target cursor JSON update.
  2. Roll Out Execute Mode Later
    Maintainers can merge the code while keeping scheduled execute variables disabled until a manual dry-run and targeted execute run confirm production cursor behavior.
  3. Pause For Broader State-Publish Policy
    If maintainers want a general generated-results publication design first, pause this PR and keep proof lanes on fixed-prefix scanning for now.

Next step before merge

  • [P2] The remaining action is maintainer ownership of scheduled state rollout and first-run monitoring, not a concrete automated code repair.

Security
Cleared: No concrete security or supply-chain regression was found; the workflow keeps pinned actions and existing token patterns while publishing exact generated cursor files.

Review details

Best possible solution:

Land the cursor rotation once maintainers accept the scheduled automation rollout, then inspect the first executed scheduled proof-lane state commit for only the expected exact per-target cursor JSON update.

Do we have a high-confidence way to reproduce the issue?

Yes, source-based rather than live: current main sorts proof-lane candidates deterministically and stops at processedLimit without cursor state, so repeated untargeted runs can re-inspect the same skip-heavy prefix. The PR body adds live CLI proof for the proposed rotated behavior.

Is this the best way to solve the issue?

Yes. Durable cursor rotation that ignores targeted runs and publishes exact per-target cursor files is a narrow maintainable fix; the remaining question is rollout monitoring, not a code defect.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against ae63b16d6c74.

Label changes

Label justifications:

  • P2: This is a bounded proof automation fairness fix with limited blast radius, but it changes scheduled bot workflow behavior.
  • merge-risk: 🚨 automation: The PR changes scheduled proof handling and durable cursor publication, which green tests cannot fully validate in production state.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-fix live CLI output showing both proof lanes writing cursor files and rotating on later execute runs.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix live CLI output showing both proof lanes writing cursor files and rotating on later execute runs.

Evidence reviewed

What I checked:

  • Repository policy read: AGENTS.md was read fully; its automation-safety guidance applies because this PR changes scheduled proof handling and generated state publication. (AGENTS.md:1, ae63b16d6c74)
  • Current main fixed-prefix behavior: Current main sorts proof-lane candidates by likely status, reviewed/update time, and number, then proof-nudges processes that fixed list up to processedLimit with no cursor state. (src/clawsweeper.ts:18334, ae63b16d6c74)
  • PR cursor implementation: PR head adds cursor read/write helpers, rotates candidates after the saved cursor, ignores cursor paths for targeted runs, and writes cursor state after execute-mode processing in both proof lanes. (src/clawsweeper.ts:18379, eb444762164a)
  • Exact cursor-file workflow publish: The workflow passes per-target cursor files only for untargeted runs and publishes only files that the corresponding execute-mode lane produced. (.github/workflows/proof-nudges.yml:163, eb444762164a)
  • Focused regression coverage: Tests cover candidate rotation, execute-mode cursor writes for both lanes, targeted runs leaving cursor files absent, and exact file-level workflow publication gates. (test/proof-nudge-policy.test.ts:325, eb444762164a)
  • Real behavior proof: The PR body includes after-fix live CLI output showing both proof-nudges and bot-proof writing cursor files and rotating to a later candidate window on the next execute run. (eb444762164a)
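
The fixed-prefix behavior noted for current main can be illustrated with a short sketch. The field names and sort directions here are assumptions (likely candidates first, oldest review time first, then ascending number); the review only states which keys are used, not their order or direction.

```typescript
// Hypothetical candidate shape; only the three sort keys come from the review notes.
interface ProofCandidate {
  number: number;
  likely: boolean;
  reviewedAt: string; // ISO 8601 UTC timestamp, so string comparison orders by time
}

// Assumed ordering: likely first, then oldest reviewedAt, then lowest number.
function compareCandidates(a: ProofCandidate, b: ProofCandidate): number {
  if (a.likely !== b.likely) return a.likely ? -1 : 1;
  if (a.reviewedAt !== b.reviewedAt) return a.reviewedAt < b.reviewedAt ? -1 : 1;
  return a.number - b.number;
}

// Without cursor state, every run inspects the same leading window.
function fixedPrefixWindow(candidates: ProofCandidate[], processedLimit: number): number[] {
  return [...candidates]
    .sort(compareCandidates)
    .slice(0, processedLimit)
    .map((candidate) => candidate.number);
}
```

If the first `processedLimit` items in that order are all skipped and their sort keys do not change, every later run returns the same window, which is the starvation this PR addresses.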

Likely related people:

  • Dallin Romney: Blame and git history show the proof-lane candidate scan, proof-nudges workflow, and docs were introduced in the merged proof workflow work. (role: introduced behavior and adjacent workflow owner; confidence: high; commits: 4e5c4d47c83f, 4cab87ce3d7f, f337079999cc; files: src/clawsweeper.ts, .github/workflows/proof-nudges.yml, docs/proof-nudges.md)
  • Hannes Rudolph: Recent proof-policy work touched proof nudge behavior, docs, and focused tests near the same proof-handling area. (role: recent proof behavior contributor; confidence: medium; commits: bbdd0971612a; files: src/clawsweeper.ts, docs/proof-nudges.md, test/proof-nudge-policy.test.ts)
  • Vincent Koc: Recent source work touched ClawSweeper runtime behavior in the same large source file, though not the central proof-lane implementation. (role: recent adjacent source contributor; confidence: low; commits: babd6a43, a6ee02e3; files: src/clawsweeper.ts, test/sweep-workflow.test.ts)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. labels Jun 27, 2026
@brokemac79 brokemac79 changed the title [codex] fix: rotate proof nudge scans fix: rotate proof nudge scans Jun 27, 2026
@brokemac79

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 27, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@brokemac79

Contributor Author

@clawsweeper re-review

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 27, 2026
@brokemac79 brokemac79 marked this pull request as ready for review June 27, 2026 20:43
@brokemac79 brokemac79 requested a review from a team as a code owner June 27, 2026 20:43
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 27, 2026
@brokemac79

Contributor Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 27, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels Jun 27, 2026
@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. and removed status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. labels Jun 27, 2026
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels Jun 27, 2026

Labels

merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. P2 Normal priority bug or improvement with limited blast radius. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

1 participant