feat(skills): progressive disclosure (replace full-body load, ~9k tokens/turn savings) by lxistired · Pull Request #172 · OpenCoworkAI/open-cowork

lxistired · 2026-04-28T11:31:48Z

Problem

DefaultResourceLoader embeds every SKILL.md body into the system prompt on every turn via additionalSkillPaths. With the 5 bundled skills (xlsx/pdf/docx/pptx/skill-creator), each turn carries an extra ~9k input tokens — even when the user is just asking a plain text question. The cost compounds for long sessions, and the user pays for content the model rarely needs.

Fix — Claude Code-style progressive disclosure

New module src/main/skills/skill-progressive-loader.ts:
- scanSkillFrontmatter — reads only the YAML frontmatter at the top of each SKILL.md (~250 chars/skill instead of full body).
- buildSkillListing — emits a compact <available_skills> block (bundled first, custom truncated first when over budget). Total budget capped at min(20k chars, 1% × contextWindow).
- buildSkillTool — exposes a Skill function tool. Model calls Skill(skill_name='xlsx') and the SKILL.md body is read on demand and returned as the tool result. Loaded names tracked in a per-session Set so the cost is paid once per skill per session.
agent-runner.ts integration:
- Scan skillPaths frontmatter once, append <available_skills> to the appendSystemPrompt block.
- Pass additionalSkillPaths: [] to DefaultResourceLoader so it no longer embeds bodies.
- Push buildSkillTool into mcpCustomTools when at least one skill is discovered.
- Persist sessionLoadedSkills in CachedPiSession so reused sessions keep tracking what's already been loaded into history.
System prompt rule 6 — eager-load policy: when the user mentions a skill name or implies its domain, the model loads it in the same turn instead of asking 'should I load it?'. Loading is cheap; the user shouldn't have to confirm twice.

Cost impact

gpt-5.5 / claude-sonnet-4 typical pricing (~$3/Mtok input):

9k tokens × $3/Mtok × 20-turn session ≈ $0.54 saved per session when no skill is needed.
~$0.40 saved per session when one skill is loaded (amortized over remaining turns).

Files

src/main/skills/skill-progressive-loader.ts (new, 250 lines)
src/main/claude/agent-runner.ts (+54/-10)

Test plan

tsc --noEmit passes
First turn: <available_skills> block visible in system prompt, full SKILL.md bodies are NOT
User asks 'do you have the xlsx skill?' → model loads it in same turn (rule 6.b)
Same skill loaded twice in one session → second call still returns content (no false-block)
No skills installed → no <available_skills> block emitted, no Skill tool registered
Custom skills work the same as bundled (just no priority in budget gate)

…ens/turn savings) DefaultResourceLoader currently embeds every SKILL.md body into the system prompt every turn via additionalSkillPaths. With 5 bundled skills (xlsx/pdf/docx/pptx/skill-creator) this costs ~9k input tokens per turn, repeated for the entire conversation. Long sessions amplify the cost — and the user pays for content the model rarely needs. Switch to Claude Code-style progressive disclosure: 1. New module src/main/skills/skill-progressive-loader.ts: - scanSkillFrontmatter — read only the YAML frontmatter at the top of each SKILL.md (~250 chars/skill instead of full body). - buildSkillListing — emit a compact <available_skills> block (bundled first, custom truncated when over budget). Total cap min(20k, 1% × ctx_window). - buildSkillTool — exposes a 'Skill' function tool. Model calls Skill(skill_name='xlsx') and the SKILL.md body is read on demand and returned as the tool result. Loaded names tracked in a per-session Set so the cost is paid once per skill per session. 2. agent-runner.ts integration: - Scan skillPaths frontmatter once, append <available_skills> to the appendSystemPrompt block. - Pass additionalSkillPaths: [] to DefaultResourceLoader so it does NOT embed bodies any more. - Push buildSkillTool into mcpCustomTools when ≥1 skill found. - Persist sessionLoadedSkills in CachedPiSession so reused sessions keep tracking what's already been loaded into history. 3. System prompt rule 6 — eager-load policy: when the user mentions a skill name or implies its domain, the model loads it in the same turn instead of asking 'should I load it?'. Loading is cheap, the user shouldn't have to confirm twice. Cost impact (gpt-5.5 / claude-sonnet-4 typical pricing): ~9k tokens × $3/Mtok input × 20-turn session ≈ $0.54 saved per session per user when no skill is needed; ~$0.40 saved when one skill is loaded (amortized over remaining turns).

github-actions

Findings

[Major] The advertised "COMPLETE" skill list stops being complete once the budget is exceeded. buildSkillListing() drops custom skills entirely when even the - name fallback no longer fits, but the prompt still tells the model to answer availability from a complete installed-skill index. That can produce false "not available" answers for real installed skills. Evidence src/main/skills/skill-progressive-loader.ts:173, src/main/skills/skill-progressive-loader.ts:188.
[Minor] Bundled-skill precedence is checked against the directory name, while the seen map is keyed by the frontmatter name. A custom skill whose folder name differs from its declared name: can still overwrite a bundled skill with the same logical name, despite the "bundled wins" rule. Evidence src/main/skills/skill-progressive-loader.ts:120, src/main/skills/skill-progressive-loader.ts:127.

Summary

Review mode: initial

Found 2 issues in the progressive skill loader. The main regression is incorrect skill-availability answers when the listing is truncated under budget. Not found in repo/docs: PR-head tests covering budget truncation or duplicate-name precedence.

Testing

Not run (automation)

Open Cowork Bot

github-actions · 2026-04-30T09:30:50Z

+
+  return [
+    '<available_skills>',
+    `Below is the COMPLETE list of skills installed locally (${lines.length} total). When the user asks "do you have skill X" or "can you use X", you MUST scan this list and answer based on it — do not hallucinate or list a hardcoded subset. To load a skill's full instructions, call the Skill tool with the skill_name argument. Loaded content stays in history; no need to reload.`,


[MAJOR] Once this loop runs out of budget, some custom skills disappear from the listing entirely, but the prompt right below still calls the result the "COMPLETE" installed-skill list. That means the model can answer "not available" for a real installed skill whenever the user has enough custom skills to overflow the budget.

Suggested fix:

const lineMin = `- ${entry.name}`; const nextLine = usedChars + overhead <= totalBudget || entry.bundled ? lineFull : lineMin; lines.push(nextLine); usedChars += nextLine.length + 1;

github-actions · 2026-04-30T09:30:51Z

+      const filePath = path.join(rootDir, skillName, 'SKILL.md');
+      if (!fs.existsSync(filePath)) continue;
+      // bundled wins over user when same name appears
+      if (seen.has(skillName) && !bundled) continue;


[MINOR] The precedence check is using skillName (the folder name), but the map is keyed by the parsed frontmatter name. If a custom skill lives in a differently named folder and declares the same logical name: as a bundled skill, this path will still overwrite the bundled entry.

Suggested fix:

const existing = seen.get(name); if (existing && (existing.bundled || !bundled)) continue; seen.set(name, { name, description: truncateToChars(description, MAX_LISTING_DESC_CHARS), filePath, bundled, rootDir, });

hqhq1025 added bot-rerun Temporary label for rerunning bot automation and removed bot-rerun Temporary label for rerunning bot automation labels Apr 30, 2026

github-actions Bot reviewed Apr 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(skills): progressive disclosure (replace full-body load, ~9k tokens/turn savings)#172

feat(skills): progressive disclosure (replace full-body load, ~9k tokens/turn savings)#172
lxistired wants to merge 1 commit into
OpenCoworkAI:mainfrom
lxistired:feat/progressive-skills-disclosure

lxistired commented Apr 28, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot Apr 30, 2026

Uh oh!

github-actions Bot Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lxistired commented Apr 28, 2026

Problem

Fix — Claude Code-style progressive disclosure

Cost impact

Files

Test plan

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants