perf: cache report entries during scans by brokemac79 · Pull Request #378 · openclaw/clawsweeper

brokemac79 · 2026-06-27T20:22:27Z

What Problem This Solves

ClawSweeper's apply/report scans reread the same report markdown while filtering, sorting, and rendering dashboard state. On larger queues that repeats local disk I/O and repeats frontmatter extraction for data that belongs to the same report snapshot.

Why This Change Was Made

This adds a small private ReportEntry reader that loads each markdown report once with its filename, number, path, repository, and markdown body. The apply queue now reuses that entry for active-repo filtering, item-number filtering, sort fields, and the initial apply-loop markdown. Dashboard stats now iterate the same parsed entries for open and archived reports, so final file counts no longer reread every markdown file.
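A minimal sketch of the read-once idea described above. The `ReportEntry` field names and the `reportEntriesForDir` name come from this PR description; the function body, the `<number>.md` filename pattern, and the `repository:` frontmatter key are illustrative assumptions, not the code in src/clawsweeper.ts.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Field names follow the PR description; the exact shape is an assumption.
interface ReportEntry {
  filename: string;
  number: number;
  path: string;
  repository: string;
  markdown: string;
}

// Hypothetical helper: read `repository: owner/name` from a leading frontmatter block.
function frontmatterRepository(markdown: string): string {
  const block = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  const match = block ? /^repository:\s*(\S+)\s*$/m.exec(block[1]) : null;
  return match ? match[1] : "";
}

// Read every `<number>.md` file in `dir` exactly once and keep the parsed result.
function reportEntriesForDir(dir: string): ReportEntry[] {
  if (!fs.existsSync(dir)) return [];
  const result: ReportEntry[] = [];
  for (const filename of fs.readdirSync(dir)) {
    const match = /^(\d+)\.md$/.exec(filename);
    if (!match) continue; // skip anything that is not a numbered report
    const fullPath = path.join(dir, filename);
    const markdown = fs.readFileSync(fullPath, "utf8");
    result.push({
      filename,
      number: Number(match[1]),
      path: fullPath,
      repository: frontmatterRepository(markdown),
      markdown,
    });
  }
  return result.sort((a, b) => a.number - b.number);
}

// Usage: build a throwaway report tree and scan it once.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "report-entry-sketch-"));
fs.writeFileSync(path.join(dir, "322.md"), "---\nrepository: openclaw/clawsweeper\n---\nbody 322\n");
fs.writeFileSync(path.join(dir, "321.md"), "---\nrepository: openclaw/clawsweeper\n---\nbody 321\n");
fs.writeFileSync(path.join(dir, "notes.txt"), "ignored");
const entries = reportEntriesForDir(dir);
console.log(entries.map((entry) => `${entry.repository}#${entry.number}`));
```

Later filtering, sorting, and rendering steps can then work from `entries` instead of calling `readFileSync` again for the same file.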

The apply loop still rereads pair-counterpart reports from disk when checking same-author pair closeability, because those checks can depend on report mutations earlier in the same apply run.
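The reason for keeping that reread can be illustrated with a small sketch. It assumes a hypothetical `action:` frontmatter key and a temp file standing in for a pair-counterpart report; neither is taken from the actual report format.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical frontmatter key, used only to make staleness visible.
function reportAction(markdown: string): string {
  const match = /^action:\s*(\S+)\s*$/m.exec(markdown);
  return match ? match[1] : "unknown";
}

const proofDir = fs.mkdtempSync(path.join(os.tmpdir(), "pair-reread-sketch-"));
const counterpartPath = path.join(proofDir, "322.md");
fs.writeFileSync(counterpartPath, "---\naction: keep_open\n---\n");

// Snapshot taken once at the start of the run, as a scan cache would do.
const cachedMarkdown = fs.readFileSync(counterpartPath, "utf8");

// Earlier in the same run, processing another item rewrites the counterpart report.
fs.writeFileSync(counterpartPath, "---\naction: close\n---\n");

// The cached copy still shows the old state; a fresh read sees the mutation.
const staleAction = reportAction(cachedMarkdown);
const freshAction = reportAction(fs.readFileSync(counterpartPath, "utf8"));
console.log({ staleAction, freshAction });
```

A check that must observe in-run mutations therefore cannot rely on the start-of-run snapshot.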

User Impact

Apply and dashboard runs do less repeated filesystem work as report queues grow, without changing close gates, review decisions, dashboard rows, or public output.

Evidence

Head proofed: 9dc7fed32df0413c46d0467f9b04ad279238d1fe.

Duplicate scan before opening found no open PR for report entry cache, reportEntriesForDir, or dashboard cache work.
pnpm install --frozen-lockfile completed and installed the pinned dev toolchain. corepack enable printed a Windows Program Files permission warning first, but pnpm itself completed successfully.
pnpm exec oxfmt --write src/clawsweeper.ts passed.
pnpm run build passed.
pnpm run build:all passed.
pnpm run lint passed.
pnpm run check:active-surface passed.
pnpm run check:limits passed.
pnpm run format:check passed.
node --test test/clawsweeper.test.ts passed: 58 tests, 58 pass.
codex review -c service_tier='fast' --uncommitted reported no actionable correctness issues.
Runtime apply proof after the cache change used a rebuilt dist/clawsweeper.js and a realistic temp report tree with two open reports. The run exercises the cached report-entry apply scan, applies active-repo filtering and queue processing, exits 0, and never invokes the failing mock gh shim:

$ node dist/clawsweeper.js apply-decisions --target-repo openclaw/clawsweeper --items-dir %TEMP%\clawsweeper-report-entry-proof-...\items --closed-dir %TEMP%\clawsweeper-report-entry-proof-...\closed --plans-dir %TEMP%\clawsweeper-report-entry-proof-...\plans --report-path %TEMP%\clawsweeper-report-entry-proof-...\apply-report.json --dry-run --limit 10 --processed-limit 2 --close-delay-ms 0 --progress-every 1
[apply] 2026-06-27T20:43:56.285Z starting apply: files=2 dry_run=true apply_kind=issue min_age=0 days apply_close_reasons=all stale_min_age_days=60 close_delay_ms=0 sync_comments_only=false comment_sync_min_age_days=0 max_runtime_ms=0 item_numbers=all closed=0/10 processed=0/2 counts={}
[apply] 2026-06-27T20:43:56.287Z skipped #321: review lacks verified local checkout access closed=0/10 processed=1/2 counts={"kept_open":1}
[apply] 2026-06-27T20:43:56.288Z skipped #322: review lacks verified local checkout access closed=0/10 processed=2/2 counts={"kept_open":2}
[apply] 2026-06-27T20:43:56.289Z finished apply closed=0/10 processed=2/2 counts={"kept_open":2}
[
  {
    "number": 321,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  },
  {
    "number": 322,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  }
]
APPLY_EXIT=0
GH_LOG_EXISTS=False
APPLY_REPORT_JSON:
[
  {
    "number": 321,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  },
  {
    "number": 322,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  }
]

Runtime dashboard proof after the cache change ran the rebuilt dashboard command in a throwaway copy of dist/, config/, and README.md so the working tree README was not touched. It used two verified open reports and one archived closed report; the mock gh fallback proves the generated dashboard came from local cached report entries:

$ node %TEMP%\clawsweeper-dashboard-cache-proof-...\runtime\dist\clawsweeper.js dashboard --target-repo openclaw/clawsweeper --items-dir %TEMP%\clawsweeper-dashboard-cache-proof-...\items --closed-dir %TEMP%\clawsweeper-dashboard-cache-proof-...\closed
[dashboard] failed to fetch open item counts for openclaw/clawsweeper; using local record counts: Command failed: C:\Program Files\nodejs\node.exe %TEMP%\clawsweeper-dashboard-cache-proof-...\bin\gh-mock.js api graphql -f query=query { repository(owner: "openclaw", name: "clawsweeper") { issues(states: OPEN) { totalCount } pullRequests(states: OPEN) { totalCount } } }
intentional dashboard proof gh failure; using local record fallback

DASHBOARD_EXIT=0
GH_FALLBACK_CALLS=1
GENERATED_DASHBOARD_LINES:
| Reviewed files | 2 |
| Work candidates awaiting promotion | 2 |
| Closed by Codex apply | 1 |
| Archived closed files | 1 |
| [openclaw/clawsweeper](https://github.com/openclaw/clawsweeper) | [#400](https://github.com/openclaw/clawsweeper/issues/400) | Cache dashboard closed proof 400 | fixed | Jun 27, 2026, 20:45 UTC | [../closed/400.md](https://github.com/openclaw/clawsweeper/blob/main/../closed/400.md) |
| [openclaw/clawsweeper](https://github.com/openclaw/clawsweeper) | [#322](https://github.com/openclaw/clawsweeper/pull/322) | Cache dashboard proof 322 | medium | candidate | Jun 27, 2026, 20:41 UTC | _pending_ | [../items/322.md](https://github.com/openclaw/clawsweeper/blob/main/../items/322.md) |
| [openclaw/clawsweeper](https://github.com/openclaw/clawsweeper) | [#321](https://github.com/openclaw/clawsweeper/issues/321) | Cache dashboard proof 321 | medium | candidate | Jun 27, 2026, 20:40 UTC | [records/openclaw-clawsweeper/plans/321.md](https://github.com/openclaw/clawsweeper/blob/main/records/openclaw-clawsweeper/plans/321.md) | [../items/321.md](https://github.com/openclaw/clawsweeper/blob/main/../items/321.md) |

Local full pnpm run test:unit on this main-based Windows checkout is still blocked by the pre-existing Windows portability failures being handled in #376: Codex proof tests hit local config/service-tier handling, one command test observes CRLF output, and the POSIX mode test is not portable on Windows main. This branch does not depend on PR #376 and has no source-file overlap with it; GitHub CI should provide the full Linux pnpm check proof for this PR.

clawsweeper · 2026-06-27T20:23:21Z

Codex review: needs maintainer review before merge. Reviewed June 27, 2026, 4:52 PM ET / 20:52 UTC.

Summary
The PR adds a private ReportEntry cache in src/clawsweeper.ts and reuses parsed report markdown in apply queue and dashboard scans.

Reproducibility: not applicable. This is a performance refactor rather than a bug report with a failing reproduction. Source review and PR proof focus on preserving apply/dashboard output while reducing repeated reads.

Review metrics: 2 noteworthy metrics.

Changed surface: 1 file modified, +63/-56. The refactor is confined to the main ClawSweeper source file and does not touch dependencies, workflows, or generated state.
Runtime proof paths: 2 CLI flows shown. The PR body now covers both affected runtime surfaces: apply scan processing and dashboard generation.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
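The "weaker of the two" rule can be sketched as a minimum over an ordered rank list. The rank names come from the legend in this review; the array ordering, the `overallRank` function, and the omission of the "off-meta tidepool" rank are assumptions made for illustration.

```typescript
// Ranks from weakest to strongest, following the order implied by the legend.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is whichever of the two ratings sits lower in the list.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}

const overall = overallRank("diamond lobster", "platinum hermit");
console.log(overall); // platinum hermit
```

With the ratings reported for this PR, the sketch yields the same overall rank as the review.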

Rank-up moves:

none.

Next step before merge

[P2] No repair-lane work is needed; the remaining action is ordinary maintainer review of the clean, proof-backed PR.

Security
Cleared: No security or supply-chain concern found; the diff only refactors local markdown report reads in src/clawsweeper.ts.

Review details

Best possible solution:

Land the focused cache refactor after ordinary maintainer review, keeping apply/dashboard semantics and the flat report layout unchanged.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a performance refactor rather than a bug report with a failing reproduction. Source review and PR proof focus on preserving apply/dashboard output while reducing repeated reads.

Is this the best way to solve the issue?

Yes: a private report-entry helper keeps the cache local to existing scans without adding API or configuration surface. Current main does not already provide this optimization.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against ae63b16d6c74.

Label changes

Label changes:

add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.
add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.
remove status: 📣 needs proof: Current PR status label is status: 👀 ready for maintainer look.
remove merge-risk: 🚨 automation: Current PR review selected no merge-risk labels.
remove rating: 🦪 silver shellfish: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.

Label justifications:

P3: This is a low-risk internal performance cleanup with no claimed user-facing behavior change.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.

Evidence reviewed

What I checked:

Repository policy read: Read the full target AGENTS.md; its narrow, evidence-backed, automation-safe guidance applies because this PR changes apply/dashboard automation in the main ClawSweeper source. (AGENTS.md:1, ae63b16d6c74)
Current main still has uncached reads: Current main builds apply queue entries by reading each report during filtering and again for sort fields, so the cache refactor is not already implemented on main. (src/clawsweeper.ts:16921, ae63b16d6c74)
PR apply scan change: The PR head filters cached ReportEntry values by target repo and requested item number, then reuses entry.markdown and entry.number in the apply loop. (src/clawsweeper.ts:16921, 9dc7fed32df0)
PR dashboard scan change: The PR head constructs open and closed dashboard stats from cached entries and derives report paths and counts from those entries. (src/clawsweeper.ts:19380, 9dc7fed32df0)
Proof and checks: The PR body now includes after-change terminal output for an apply dry-run and a dashboard run on realistic temp report trees, and GitHub reports successful pnpm check, CodeQL, Socket, and notifier checks for the reviewed head. (9dc7fed32df0)
Feature history: Blame on the current apply queue, dashboard stats, and markdown repository helper points to commit 4e5c4d4, and broader git history shows earlier apply/dashboard work in src/clawsweeper.ts. (src/clawsweeper.ts:16921, 4e5c4d47c83f)
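The apply-scan and dashboard-scan items above amount to filtering and counting in memory once the entries are loaded. A hedged sketch, where `selectEntries`, `fileCounts`, and the reduced entry shape are hypothetical names for illustration rather than functions from src/clawsweeper.ts:

```typescript
// Reduced entry shape for this sketch; the PR's ReportEntry also carries filename and path.
interface ReportEntry {
  number: number;
  repository: string;
  markdown: string;
}

// Keep entries for the target repo and, when a set is given, the requested item numbers.
function selectEntries(
  entries: ReportEntry[],
  targetRepo: string,
  itemNumbers: ReadonlySet<number> | null,
): ReportEntry[] {
  return entries.filter(
    (entry) =>
      entry.repository === targetRepo &&
      (itemNumbers === null || itemNumbers.has(entry.number)),
  );
}

// Derive file counts from the already-loaded entries, with no further disk reads.
function fileCounts(openList: ReportEntry[], closedList: ReportEntry[]) {
  return { openFiles: openList.length, archivedFiles: closedList.length };
}

const openEntries: ReportEntry[] = [
  { number: 321, repository: "openclaw/clawsweeper", markdown: "report 321" },
  { number: 322, repository: "openclaw/clawsweeper", markdown: "report 322" },
  { number: 7, repository: "example/other", markdown: "report 7" },
];
const closedEntries: ReportEntry[] = [
  { number: 400, repository: "openclaw/clawsweeper", markdown: "report 400" },
];

const queue = selectEntries(openEntries, "openclaw/clawsweeper", new Set([322]));
const counts = fileCounts(
  selectEntries(openEntries, "openclaw/clawsweeper", null),
  selectEntries(closedEntries, "openclaw/clawsweeper", null),
);
console.log(queue.map((entry) => entry.number), counts);
```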

Likely related people:

RomneyDa: Current-main blame for the apply queue, dashboard stats, and markdown repository helper traces to commit 4e5c4d4, merged through fix: rerender advisory label review comments #315. (role: current implementation author; confidence: high; commits: 4e5c4d47c83f; files: src/clawsweeper.ts)
steipete: Git history shows the proposal apply mode and fleet dashboard work in src/clawsweeper.ts came from earlier commits by this contributor. (role: original apply/dashboard area contributor; confidence: medium; commits: 3c92721f065c, 849e6a9eb161; files: src/clawsweeper.ts)
brokemac79: Recent merged work in [codex] Route PR close reviews to autoclose #371 touched adjacent trusted close/apply-review automation in src/clawsweeper.ts, so this contributor is connected to the current area beyond only opening this PR. (role: recent adjacent contributor; confidence: medium; commits: 84796f0c9b0c; files: src/clawsweeper.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

perf: cache report entries during scans

9dc7fed

brokemac79 wants to merge 1 commit into openclaw:main from brokemac79:codex/report-entry-cache
