Skip to content

Claim command dispatches before execution#380

Open
ag-linden wants to merge 1 commit into
openclaw:mainfrom
ag-linden:fix/comment-router-dispatch-idempotency
Open

Claim command dispatches before execution#380
ag-linden wants to merge 1 commit into
openclaw:mainfrom
ag-linden:fix/comment-router-dispatch-idempotency

Conversation

@ag-linden

Copy link
Copy Markdown

Summary

  • add a durable claimed command-ledger state for dispatch-bearing comment-router commands
  • write that claim after mutation/capacity preflight and before ack comments or worker/review dispatches can run
  • keep the normal completion ledger update after command side effects finish, upgrading claimed entries to executed or waiting

Root cause

The comment router previously executed ready commands before appending them to results/comment-router.json. A command that emitted a repository_dispatch or workflow dispatch could therefore be replayed if the process failed before the final ledger write.

Why it matters

The local ledger is the durable idempotency boundary for maintainer commands, while GitHub dispatch APIs own the external side effect. Claiming dispatch-bearing command versions before emitting external work keeps retry runs from launching duplicate review/repair workers after a later failure.

Validation

  • pnpm run build:repair
  • /Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin/node --test test/repair/comment-router-utils.test.ts test/repair/comment-router-core.test.ts
  • PATH=/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/bin:$PATH pnpm run test:repair
  • PATH=/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/bin:$PATH pnpm run lint:repair
  • PATH=/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/bin:$PATH pnpm run lint:scripts
  • PATH=/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/bin:$PATH pnpm run format:check
  • PATH=/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/bin:$PATH pnpm run check:active-surface
  • PATH=/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:/Users/alexgild/.cache/codex-runtimes/codex-primary-runtime/dependencies/bin:$PATH pnpm run check:limits

Maintainer note

This makes dispatch-bearing commands at-most-once across crashes by treating the pre-dispatch claim as terminal for replay suppression. If automatic recovery for the small pre-dispatch crash window is important, that can be added separately as an explicit stale-claim recovery path rather than replaying commands by default.

@clawsweeper

clawsweeper Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed June 29, 2026, 4:17 PM ET / 20:17 UTC.

Summary
The PR adds a claimed command-ledger state for dispatch-bearing comment-router commands, writes that claim before acknowledgements and dispatch, and adds focused tests for claim suppression and final ledger upgrades.

Reproducibility: yes. source inspection gives a high-confidence current-main path: dispatch-bearing commands execute before the ledger is appended and written, so a failure after dispatch but before the final write can replay the same command version. I did not run a destructive live dispatch.

Review metrics: 2 noteworthy metrics.

  • Diff footprint: 4 files changed, +140/-4. The patch is small enough to review directly, but it touches command-router execution and durable ledger behavior.
  • Claimed dispatch actions: 3 action names. dispatch_clawsweeper, dispatch_repair, and dispatch_assist are the external-work paths whose retry behavior changes after merge.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted terminal or log proof from a real comment-router execution or controlled crash/retry run showing the claimed ledger entry and final executed or waiting upgrade.
  • [P1] Get maintainer acceptance of the pre-dispatch loss window or add stale-claim recovery before merge.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR body lists tests and checks only; add redacted terminal output, logs, or an artifact from a real comment-router claim/dispatch run, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.

Risk before merge

  • [P1] A crash after the new writeLedger claim but before acknowledgements, job creation, or external dispatch would now suppress the same command version on retry, so a maintainer command can be lost unless maintainers explicitly accept at-most-once semantics or add stale-claim recovery.
  • [P1] The PR body provides validation commands but no real after-fix router run or crash/retry proof, so the changed operational behavior is not yet proven outside tests/source inspection.

Maintainer options:

  1. Add stale-claim recovery
    Require a bounded recovery path that can detect a claimed-but-not-dispatched command and safely retry or surface it without launching duplicate workers.
  2. Accept at-most-once commands
    A maintainer may intentionally accept that rare pre-dispatch crashes suppress the command rather than risk duplicate review or repair workers.
  3. Pause for live proof
    Hold the PR until the contributor shows redacted real router output for the claim, dispatch, and final ledger-upgrade path.

Next step before merge

  • [P1] Human review is needed for the missing real behavior proof and the intentional pre-dispatch crash-window tradeoff; there is no narrow repair marker to dispatch yet.

Security
Cleared: The diff touches GitHub dispatch and ledger code but adds no dependencies, workflow permission changes, token handling changes, or third-party execution path.

Review details

Best possible solution:

Land this only with maintainer-approved at-most-once semantics, or add stale-claim recovery so duplicate dispatches are prevented without silently dropping commands after a pre-dispatch failure.

Do we have a high-confidence way to reproduce the issue?

Yes, source inspection gives a high-confidence current-main path: dispatch-bearing commands execute before the ledger is appended and written, so a failure after dispatch but before the final write can replay the same command version. I did not run a destructive live dispatch.

Is this the best way to solve the issue?

Unclear. The implementation is narrow and tested, but treating claimed as terminal introduces an acknowledged pre-dispatch command-loss window that needs maintainer acceptance or a stale-claim recovery design before merge.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against ae63b16d6c74.

Label changes

Label changes:

  • add P2: This is a bounded repair-lane automation reliability fix with limited blast radius, not an emergency or active core-runtime outage.
  • add merge-risk: 🚨 automation: Merging changes the comment-router idempotency boundary and can suppress a dispatch-bearing command after a pre-dispatch crash.
  • add rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body lists tests and checks only; add redacted terminal output, logs, or an artifact from a real comment-router claim/dispatch run, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.

Label justifications:

  • P2: This is a bounded repair-lane automation reliability fix with limited blast radius, not an emergency or active core-runtime outage.
  • merge-risk: 🚨 automation: Merging changes the comment-router idempotency boundary and can suppress a dispatch-bearing command after a pre-dispatch crash.
  • rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body lists tests and checks only; add redacted terminal output, logs, or an artifact from a real comment-router claim/dispatch run, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.
Evidence reviewed

What I checked:

  • AGENTS policy: AGENTS.md was read fully; its repair-lane, narrow-test-placement, and automation-safety guidance applies to this comment-router change. (AGENTS.md:1, ae63b16d6c74)
  • Current main dispatch-before-ledger behavior: On current main, executeCommand(command) runs inside the execute block before appendLedger and writeLedger, so a post-dispatch/pre-ledger failure can leave no terminal ledger entry for the command version. (src/repair/comment-router.ts:294, ae63b16d6c74)
  • PR claim implementation: At PR head, the router writes report.ledger_claimed and persists the ledger before ack comments and command execution; the new helper claims dispatch_clawsweeper, dispatch_repair, and dispatch_assist commands. (src/repair/comment-router.ts:305, 8e0325d6ce93)
  • Claimed entries become terminal: At PR head, shouldSuppressProcessedCommentVersion treats claimed like executed or skipped, and appendLedger persists claimed entries. (src/repair/comment-router-utils.ts:105, 8e0325d6ce93)
  • Focused tests: The PR adds tests that a claimed dispatch command suppresses replay and can later be upgraded to executed, plus a source-order check that claiming precedes ack and execution. (test/repair/comment-router-utils.test.ts:133, 8e0325d6ce93)
  • Documented command-router expectations: The repair automation docs describe maintainer commands as creating/reusing durable jobs, dispatching workers, and editing marker-backed status comments, which is the user-visible workflow affected by the claim-before-dispatch tradeoff. (docs/steerable-repair-automation.md:98, ae63b16d6c74)

Likely related people:

  • RomneyDa: Blame shows the current ledger suppression, execute ordering, and dispatch execution code came from commit 4e5c4d4, and GitHub metadata shows RomneyDa authored and merged the associated repair-lane PR. (role: introduced behavior and recent merger; confidence: high; commits: 4e5c4d47c83f; files: src/repair/comment-router.ts, src/repair/comment-router-utils.ts)
  • vincentkoc: Recent merged work by vincentkoc touched comment-router capacity and ledger/test utilities in PRs 329 and 330, which are adjacent to this dispatch-idempotency path. (role: recent adjacent contributor; confidence: medium; commits: c759f49d3c20, 25a378a5b7d9; files: src/repair/comment-router.ts, src/repair/comment-router-utils.ts, test/repair/comment-router-utils.test.ts)
  • brokemac79: PR 371 recently modified comment-router close/review automation paths, making brokemac79 relevant for adjacent routing behavior even though not the original ledger owner. (role: recent area contributor; confidence: medium; commits: 84796f0c9b0c; files: src/repair/comment-router.ts, src/repair/comment-router-core.ts, test/repair/comment-router-core.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8e0325d6ce

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +305 to +308
report.ledger_claimed = measure("claim_dispatch_commands", () =>
claimDispatchCommands(actionable),
);
if (report.ledger_claimed) writeLedger(ledgerPath(), ledger);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid suppressing commands on failed dispatch attempts

This writes a terminal claimed ledger entry before any executeCommand dispatch has completed. In repair-comment-router.yml the ledger publish step runs with if: always(), and claimed entries are included in shouldSuppressProcessedCommentVersion, so if a dispatch command throws after this write (for example gh workflow run/repository_dispatch fails or the process crashes before executeCommand reaches the dispatch), the failed claim is published and future router runs skip the same comment version instead of retrying it. That can permanently drop re-review/repair/assist requests without ever starting the worker.

Useful? React with 👍 / 👎.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. labels Jun 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 automation 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation. P2 Normal priority bug or improvement with limited blast radius. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant