perf: cache report entries during scans #378

Open
brokemac79 wants to merge 1 commit into openclaw:main from brokemac79:codex/report-entry-cache

@brokemac79 brokemac79 commented Jun 27, 2026

Contributor

What Problem This Solves

ClawSweeper's apply/report scans reread the same report markdown while filtering, sorting, and rendering dashboard state. On larger queues that repeats local disk I/O and repeats frontmatter extraction for data that belongs to the same report snapshot.

Why This Change Was Made

This adds a small private ReportEntry reader that loads each markdown report once with its filename, number, path, repository, and markdown body. The apply queue now reuses that entry for active-repo filtering, item-number filtering, sort fields, and the initial apply-loop markdown. Dashboard stats now iterate the same parsed entries for open and archived reports, so final file counts no longer reread every markdown file.
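
As a rough illustration of the reader described above, the sketch below loads each report file once and keeps the parsed fields together. The field list follows this description and `reportEntriesForDir` is the helper name mentioned under Evidence, but the function body, the `<number>.md` filename pattern, and the `repository:` frontmatter key are assumptions made for the example, not the code in src/clawsweeper.ts.

```typescript
import { mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical shape: the fields mirror the PR description, not the real source.
interface ReportEntry {
  filename: string;
  number: number;
  path: string;
  repository: string;
  markdown: string;
}

// Assumed convention for this sketch: a `repository: owner/name` line inside
// a leading `---` frontmatter block.
function repositoryFromFrontmatter(markdown: string): string {
  const frontmatter = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown)?.[1] ?? "";
  return /^repository:\s*(\S+)\s*$/m.exec(frontmatter)?.[1] ?? "";
}

// Reads every `<number>.md` file in `dir` exactly once and returns the parsed entries.
function reportEntriesForDir(dir: string): ReportEntry[] {
  const entries: ReportEntry[] = [];
  for (const filename of readdirSync(dir)) {
    const match = /^(\d+)\.md$/.exec(filename);
    if (!match) continue; // skip anything that is not a numbered report
    const path = join(dir, filename);
    const markdown = readFileSync(path, "utf8"); // the only read for this file
    entries.push({
      filename,
      number: Number(match[1]),
      path,
      repository: repositoryFromFrontmatter(markdown),
      markdown,
    });
  }
  return entries;
}

// Usage against a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), "report-entry-sketch-"));
writeFileSync(join(dir, "321.md"), "---\nrepository: openclaw/clawsweeper\n---\n# Report 321\n");
writeFileSync(join(dir, "322.md"), "---\nrepository: openclaw/clawsweeper\n---\n# Report 322\n");
writeFileSync(join(dir, "notes.txt"), "ignored");
const entries = reportEntriesForDir(dir).sort((a, b) => a.number - b.number);
console.log(entries.map((entry) => `${entry.repository}#${entry.number}`));
```

Later filtering, sorting, and rendering steps can then work from `entries` without touching the filesystem again.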

The apply loop still rereads pair-counterpart reports from disk when checking same-author pair closeability, because those checks can depend on report mutations earlier in the same apply run.
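
To make the reason for that reread concrete, here is a minimal sketch of how a cached snapshot can go stale when a report is rewritten earlier in the same run. `CachedReport` and `readFreshMarkdown` are hypothetical names for this example, and the `status:` lines stand in for whatever a real apply step would change.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical cached snapshot of one report.
interface CachedReport {
  number: number;
  path: string;
  markdown: string;
}

// Hypothetical helper: bypass the snapshot and read the current file contents.
function readFreshMarkdown(entry: CachedReport): string {
  return readFileSync(entry.path, "utf8");
}

const dir = mkdtempSync(join(tmpdir(), "pair-reread-sketch-"));
const path = join(dir, "322.md");
writeFileSync(path, "status: open\n");

// Snapshot taken at the start of the scan.
const cached: CachedReport = { number: 322, path, markdown: readFileSync(path, "utf8") };

// Simulate an earlier step in the same run mutating the counterpart report.
writeFileSync(path, "status: closed\n");

const staleView = cached.markdown; // still reflects the start-of-scan contents
const freshView = readFreshMarkdown(cached); // reflects the mutation
console.log({ staleView, freshView });
```

A check that depends on the counterpart's current state would see the old contents if it used `staleView`, which is why the pair check keeps its disk read.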

User Impact

Apply and dashboard runs do less repeated filesystem work as report queues grow, without changing close gates, review decisions, dashboard rows, or public output.

Evidence

Head proofed: 9dc7fed32df0413c46d0467f9b04ad279238d1fe.

  • Duplicate scan before opening found no open PR for report entry cache, reportEntriesForDir, or dashboard cache work.
  • pnpm install --frozen-lockfile completed and installed the pinned dev toolchain. corepack enable printed a Windows Program Files permission warning first, but pnpm itself completed successfully.
  • pnpm exec oxfmt --write src/clawsweeper.ts passed.
  • pnpm run build passed.
  • pnpm run build:all passed.
  • pnpm run lint passed.
  • pnpm run check:active-surface passed.
  • pnpm run check:limits passed.
  • pnpm run format:check passed.
  • node --test test/clawsweeper.test.ts passed: 58 tests, 58 pass.
  • codex review -c service_tier='fast' --uncommitted reported no actionable correctness issues.
  • Runtime apply proof after the cache change used a rebuilt dist/clawsweeper.js and a realistic temp report tree with two open reports. The run exercises the cached report-entry apply scan, applies active-repo filtering and queue processing, exits 0, and the failing mock gh shim was never invoked:
$ node dist/clawsweeper.js apply-decisions --target-repo openclaw/clawsweeper --items-dir %TEMP%\clawsweeper-report-entry-proof-...\items --closed-dir %TEMP%\clawsweeper-report-entry-proof-...\closed --plans-dir %TEMP%\clawsweeper-report-entry-proof-...\plans --report-path %TEMP%\clawsweeper-report-entry-proof-...\apply-report.json --dry-run --limit 10 --processed-limit 2 --close-delay-ms 0 --progress-every 1
[apply] 2026-06-27T20:43:56.285Z starting apply: files=2 dry_run=true apply_kind=issue min_age=0 days apply_close_reasons=all stale_min_age_days=60 close_delay_ms=0 sync_comments_only=false comment_sync_min_age_days=0 max_runtime_ms=0 item_numbers=all closed=0/10 processed=0/2 counts={}
[apply] 2026-06-27T20:43:56.287Z skipped #321: review lacks verified local checkout access closed=0/10 processed=1/2 counts={"kept_open":1}
[apply] 2026-06-27T20:43:56.288Z skipped #322: review lacks verified local checkout access closed=0/10 processed=2/2 counts={"kept_open":2}
[apply] 2026-06-27T20:43:56.289Z finished apply closed=0/10 processed=2/2 counts={"kept_open":2}
[
  {
    "number": 321,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  },
  {
    "number": 322,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  }
]
APPLY_EXIT=0
GH_LOG_EXISTS=False
APPLY_REPORT_JSON:
[
  {
    "number": 321,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  },
  {
    "number": 322,
    "action": "kept_open",
    "reason": "review lacks verified local checkout access"
  }
]
  • Runtime dashboard proof after the cache change ran the rebuilt dashboard command in a throwaway copy of dist/, config/, and README.md so the working tree README was not touched. It used two verified open reports and one archived closed report; the mock gh fallback proves the generated dashboard came from local cached report entries:
$ node %TEMP%\clawsweeper-dashboard-cache-proof-...\runtime\dist\clawsweeper.js dashboard --target-repo openclaw/clawsweeper --items-dir %TEMP%\clawsweeper-dashboard-cache-proof-...\items --closed-dir %TEMP%\clawsweeper-dashboard-cache-proof-...\closed
[dashboard] failed to fetch open item counts for openclaw/clawsweeper; using local record counts: Command failed: C:\Program Files\nodejs\node.exe %TEMP%\clawsweeper-dashboard-cache-proof-...\bin\gh-mock.js api graphql -f query=query { repository(owner: "openclaw", name: "clawsweeper") { issues(states: OPEN) { totalCount } pullRequests(states: OPEN) { totalCount } } }
intentional dashboard proof gh failure; using local record fallback

DASHBOARD_EXIT=0
GH_FALLBACK_CALLS=1
GENERATED_DASHBOARD_LINES:
| Reviewed files | 2 |
| Work candidates awaiting promotion | 2 |
| Closed by Codex apply | 1 |
| Archived closed files | 1 |
| [openclaw/clawsweeper](https://github.com/openclaw/clawsweeper) | [#400](https://github.com/openclaw/clawsweeper/issues/400) | Cache dashboard closed proof 400 | fixed | Jun 27, 2026, 20:45 UTC | [../closed/400.md](https://github.com/openclaw/clawsweeper/blob/main/../closed/400.md) |
| [openclaw/clawsweeper](https://github.com/openclaw/clawsweeper) | [#322](https://github.com/openclaw/clawsweeper/pull/322) | Cache dashboard proof 322 | medium | candidate | Jun 27, 2026, 20:41 UTC | _pending_ | [../items/322.md](https://github.com/openclaw/clawsweeper/blob/main/../items/322.md) |
| [openclaw/clawsweeper](https://github.com/openclaw/clawsweeper) | [#321](https://github.com/openclaw/clawsweeper/issues/321) | Cache dashboard proof 321 | medium | candidate | Jun 27, 2026, 20:40 UTC | [records/openclaw-clawsweeper/plans/321.md](https://github.com/openclaw/clawsweeper/blob/main/records/openclaw-clawsweeper/plans/321.md) | [../items/321.md](https://github.com/openclaw/clawsweeper/blob/main/../items/321.md) |

Local full pnpm run test:unit on this main-based Windows checkout is still blocked by the pre-existing Windows portability failures being handled in #376: Codex proof tests hit local config/service-tier handling, one command test observes CRLF output, and the POSIX mode test is not portable on Windows main. This branch does not depend on PR #376 and has no source-file overlap with it; GitHub CI should provide the full Linux pnpm check proof for this PR.

clawsweeper Bot commented Jun 27, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 27, 2026, 4:52 PM ET / 20:52 UTC.

Summary
The PR adds a private ReportEntry cache in src/clawsweeper.ts and reuses parsed report markdown in apply queue and dashboard scans.

Reproducibility: not applicable. This is a performance refactor rather than a bug report with a failing reproduction. Source review and PR proof focus on preserving apply/dashboard output while reducing repeated reads.

Review metrics: 2 noteworthy metrics.

  • Changed surface: 1 file modified, +63/-56. The refactor is confined to the main ClawSweeper source file and does not touch dependencies, workflows, or generated state.
  • Runtime proof paths: 2 CLI flows shown. The PR body now covers both affected runtime surfaces: apply scan processing and dashboard generation.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
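
The "weaker of the two" rule can be sketched as a minimum over an ordered rank list. The ordering below is assumed from the order of the rank legend in this comment, and `RANKS` and `overallRank` are illustrative names rather than ClawSweeper's actual implementation; the "off-meta tidepool" value is left out because it means the rating does not apply.

```typescript
// Ranks ordered weakest to strongest (assumed from the rank legend).
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is whichever of the two inputs sits lower in the ordering.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}

// This PR: proof is diamond lobster, patch quality is platinum hermit.
console.log(overallRank("diamond lobster", "platinum hermit")); // prints "platinum hermit"
```

With these inputs the result matches the overall rating reported for this PR.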

Rank-up moves:

  • none.

Next step before merge

  • [P2] No repair-lane work is needed; the remaining action is ordinary maintainer review of the clean, proof-backed PR.

Security
Cleared: No security or supply-chain concern found; the diff only refactors local markdown report reads in src/clawsweeper.ts.

Review details

Best possible solution:

Land the focused cache refactor after ordinary maintainer review, keeping apply/dashboard semantics and the flat report layout unchanged.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a performance refactor rather than a bug report with a failing reproduction. Source review and PR proof focus on preserving apply/dashboard output while reducing repeated reads.

Is this the best way to solve the issue?

Yes: a private report-entry helper keeps the cache local to existing scans without adding API or configuration surface. Current main does not already provide this optimization.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against ae63b16d6c74.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.
  • add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.
  • remove status: 📣 needs proof: Current PR status label is status: 👀 ready for maintainer look.
  • remove merge-risk: 🚨 automation: Current PR review selected no merge-risk labels.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.

Label justifications:

  • P3: This is a low-risk internal performance cleanup with no claimed user-facing behavior change.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-change terminal output from realistic apply dry-run and dashboard runs that exercise the cached report-entry paths.
Evidence reviewed

What I checked:

  • Repository policy read: Read the full target AGENTS.md; its narrow, evidence-backed, automation-safe guidance applies because this PR changes apply/dashboard automation in the main ClawSweeper source. (AGENTS.md:1, ae63b16d6c74)
  • Current main still has uncached reads: Current main builds apply queue entries by reading each report during filtering and again for sort fields, so the cache refactor is not already implemented on main. (src/clawsweeper.ts:16921, ae63b16d6c74)
  • PR apply scan change: The PR head filters cached ReportEntry values by target repo and requested item number, then reuses entry.markdown and entry.number in the apply loop. (src/clawsweeper.ts:16921, 9dc7fed32df0)
  • PR dashboard scan change: The PR head constructs open and closed dashboard stats from cached entries and derives report paths and counts from those entries. (src/clawsweeper.ts:19380, 9dc7fed32df0)
  • Proof and checks: The PR body now includes after-change terminal output for an apply dry-run and a dashboard run on realistic temp report trees, and GitHub reports successful pnpm check, CodeQL, Socket, and notifier checks for the reviewed head. (9dc7fed32df0)
  • Feature history: Blame on the current apply queue, dashboard stats, and markdown repository helper points to commit 4e5c4d4, and broader git history shows earlier apply/dashboard work in src/clawsweeper.ts. (src/clawsweeper.ts:16921, 4e5c4d47c83f)
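
For readers who want a picture of the apply scan change noted above, the following is a minimal sketch of filtering already-loaded entries by target repository and requested item numbers. `selectApplyEntries` is a hypothetical name, the `ReportEntry` shape is trimmed to the fields used here, and sorting by item number is an assumption standing in for the real sort fields.

```typescript
// Trimmed, hypothetical entry shape for this sketch.
interface ReportEntry {
  number: number;
  repository: string;
  markdown: string;
}

// Filters cached entries in memory; `itemNumbers` of null means "all items".
function selectApplyEntries(
  entries: readonly ReportEntry[],
  targetRepo: string,
  itemNumbers: ReadonlySet<number> | null,
): ReportEntry[] {
  return entries
    .filter((entry) => entry.repository === targetRepo)
    .filter((entry) => itemNumbers === null || itemNumbers.has(entry.number))
    .sort((a, b) => a.number - b.number); // assumed sort key for the example
}

const cached: ReportEntry[] = [
  { number: 322, repository: "openclaw/clawsweeper", markdown: "# 322" },
  { number: 400, repository: "example/other", markdown: "# 400" },
  { number: 321, repository: "openclaw/clawsweeper", markdown: "# 321" },
];

const queue = selectApplyEntries(cached, "openclaw/clawsweeper", null);
const single = selectApplyEntries(cached, "openclaw/clawsweeper", new Set([322]));
console.log(queue.map((entry) => entry.number), single.map((entry) => entry.number));
```

Neither call reads from disk; both reuse the `markdown` already held on each entry.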

Likely related people:

  • RomneyDa: Current-main blame for the apply queue, dashboard stats, and markdown repository helper traces to commit 4e5c4d4, merged through fix: rerender advisory label review comments #315. (role: current implementation author; confidence: high; commits: 4e5c4d47c83f; files: src/clawsweeper.ts)
  • steipete: Git history shows the proposal apply mode and fleet dashboard work in src/clawsweeper.ts came from earlier commits by this contributor. (role: original apply/dashboard area contributor; confidence: medium; commits: 3c92721f065c, 849e6a9eb161; files: src/clawsweeper.ts)
  • brokemac79: Recent merged work in [codex] Route PR close reviews to autoclose #371 touched adjacent trusted close/apply-review automation in src/clawsweeper.ts, so this contributor is connected to the current area beyond only opening this PR. (role: recent adjacent contributor; confidence: medium; commits: 84796f0c9b0c; files: src/clawsweeper.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper Bot added and removed labels on Jun 27, 2026.

Added:

  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
  • P3: Low-risk cleanup, docs, polish, ergonomics, or speculative feature.
  • rating: 🦪 silver shellfish: Thin PR readiness signal; proof, validation, or implementation needs work.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • merge-risk: 🚨 automation: Merging this PR could break CI, automerge, proof capture, label sync, or automation.

Removed:

  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
  • rating: 🦪 silver shellfish: Thin PR readiness signal; proof, validation, or implementation needs work.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • merge-risk: 🚨 automation: Merging this PR could break CI, automerge, proof capture, label sync, or automation.

Labels

  • P3: Low-risk cleanup, docs, polish, ergonomics, or speculative feature.
  • proof: sufficient: Contributor real behavior proof is sufficient.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

1 participant