feat(ui): show last-turn input + cache hit % in context panel (instead of cumulative) by lxistired · Pull Request #174 · OpenCoworkAI/open-cowork

lxistired · 2026-04-28T11:36:55Z

Problem

The context panel currently displays input/output token counts as cumulative sums across the entire session. After ~10 turns of normal usage, the input number reads "hundreds of thousands of tokens" — even though the cached prefix is being reused at deep discount on each turn. Users panic and assume they're being billed full rate every turn.

The context-usage bar has the inverse problem: it uses only tokenUsage.input (uncached portion), so the bar visually shrinks when a cache hit lands. But the prompt is still that big — the model still has to attend to all of it. Users see a session that feels "lighter" right when it gets fuller.
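The shrinking-bar effect can be reproduced with two turns of made-up numbers. This is only a sketch: `TurnUsage`, `barFraction`, the 200,000-token window, and the token counts are all hypothetical, and `input` is assumed to hold only the uncached portion of the prompt, as described above.

```typescript
// Hypothetical per-turn usage; `input` is assumed to be the uncached portion only.
interface TurnUsage {
  input: number;     // uncached prompt tokens for this turn
  cacheRead: number; // prompt tokens served from the cache
}

// Hypothetical helper: fraction of the context window a prompt occupies, capped at 1.
function barFraction(promptTokens: number, contextWindow: number): number {
  return Math.min(1, promptTokens / contextWindow);
}

const contextWindow = 200_000; // assumed window size, for illustration only

const turn1: TurnUsage = { input: 10_000, cacheRead: 0 };
const turn2: TurnUsage = { input: 500, cacheRead: 10_000 }; // cache hit on the prefix

// Bar driven by `input` alone: falls between turn 1 and turn 2.
const inputOnlyTurn1 = barFraction(turn1.input, contextWindow);
const inputOnlyTurn2 = barFraction(turn2.input, contextWindow);

// Bar driven by input + cacheRead: keeps growing with the conversation.
const fullPromptTurn1 = barFraction(turn1.input + turn1.cacheRead, contextWindow);
const fullPromptTurn2 = barFraction(turn2.input + turn2.cacheRead, contextWindow);

console.log({ inputOnlyTurn1, inputOnlyTurn2, fullPromptTurn1, fullPromptTurn2 });
```

With these numbers the input-only bar drops on the second turn even though the prompt grew by 500 tokens, while the combined figure rises.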

Fix

Switch the display to last-turn input + last-turn output + overall cache-hit percentage. That honestly reflects what the user is paying for.
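A minimal sketch of that display logic follows. The names `TurnUsage`, `PanelSummary`, and `summarize` are hypothetical, and the hit-rate formula (cache reads divided by cache reads plus uncached input, summed over the session) is an assumption; the PR text does not spell out how cacheHitRate is computed. It also assumes `input` excludes cached tokens.

```typescript
// Hypothetical shapes; not the PR's actual types.
interface TurnUsage {
  input: number;      // assumed: uncached prompt tokens only
  output: number;
  cacheRead?: number; // absent for providers that report no cache data
}

interface PanelSummary {
  lastInput: number;
  lastOutput: number;
  totalCacheRead: number;
  cacheHitRate: number | null; // null when no cache reads were reported
}

function summarize(turns: TurnUsage[]): PanelSummary {
  const last = turns[turns.length - 1];
  let totalInput = 0;
  let totalCacheRead = 0;
  for (const t of turns) {
    totalInput += t.input;
    totalCacheRead += t.cacheRead ?? 0;
  }
  // Assumed definition: share of all prompt tokens that came from the cache.
  const totalPrompt = totalInput + totalCacheRead;
  return {
    lastInput: last?.input ?? 0,
    lastOutput: last?.output ?? 0,
    totalCacheRead,
    cacheHitRate: totalCacheRead > 0 ? totalCacheRead / totalPrompt : null,
  };
}

const summary = summarize([
  { input: 12_000, output: 800 },
  { input: 600, output: 400, cacheRead: 12_000 },
  { input: 700, output: 500, cacheRead: 12_600 },
]);
const noCache = summarize([{ input: 100, output: 50 }]);
console.log(summary, noCache);
```

Returning `null` rather than 0 for the no-cache case lets the header omit the label entirely instead of printing "cache 0%".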

Three changes:

1. types/index.ts: TokenUsage gains optional cacheRead / cacheWrite fields. The provider/SDK already produces these; the UI was discarding them.
2. agent-runner.ts normalizeTokenUsage: extract cacheRead / cacheWrite from common provider field-name variants:
   - Anthropic: cache_read_input_tokens / cache_creation_input_tokens
   - OpenAI Responses: prompt_tokens_details.cached_tokens
   - pi-ai-core normalized: cacheRead / cacheWrite
3. ContextPanel.tsx:
   - tokenUsage now exposes lastInput + lastOutput as input/output, plus aggregates (totalCacheRead, cacheHitRate) for the header.
   - contextUsage bar uses lastInput + lastCacheRead (true prompt size), so the bar doesn't shrink when a cache hit lands.
   - Header shows cache N% inline next to input when totalCacheRead > 0; absent for non-cache providers (existing behavior preserved).
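The field-name extraction in the second change might look roughly like the sketch below. It is not the PR's code: `RawUsage`, `firstNumber`, and `extractCacheFields` are hypothetical names, and only the field names listed above are taken from the description.

```typescript
// TokenUsage with the optional cache fields described above.
interface TokenUsage {
  input: number;
  output: number;
  cacheRead?: number;
  cacheWrite?: number;
}

// Hypothetical: the untyped usage object as it arrives from a provider/SDK.
type RawUsage = Record<string, unknown>;

// Hypothetical helper: first candidate that is a finite number, else undefined.
function firstNumber(...candidates: unknown[]): number | undefined {
  for (const c of candidates) {
    if (typeof c === 'number' && Number.isFinite(c)) return c;
  }
  return undefined;
}

function extractCacheFields(raw: RawUsage): {
  cacheRead: number | undefined;
  cacheWrite: number | undefined;
} {
  // prompt_tokens_details is a nested object when present.
  const details = raw['prompt_tokens_details'];
  const cachedTokens =
    typeof details === 'object' && details !== null
      ? (details as RawUsage)['cached_tokens']
      : undefined;
  return {
    cacheRead: firstNumber(raw['cacheRead'], raw['cache_read_input_tokens'], cachedTokens),
    cacheWrite: firstNumber(raw['cacheWrite'], raw['cache_creation_input_tokens']),
  };
}

const anthropicStyle = extractCacheFields({
  cache_read_input_tokens: 9000,
  cache_creation_input_tokens: 120,
});
const openAiStyle = extractCacheFields({ prompt_tokens_details: { cached_tokens: 2048 } });
const noCacheInfo = extractCacheFields({ input_tokens: 50 });
console.log(anthropicStyle, openAiStyle, noCacheInfo);
```

Leaving both fields `undefined` when nothing matches is what allows the header to stay unchanged for providers without cache data.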

Files

src/renderer/types/index.ts (+4)
src/renderer/components/ContextPanel.tsx (+47/-18)
src/main/claude/agent-runner.ts (normalizeTokenUsage only, +24/-3)

Test plan

tsc --noEmit passes
First message: input/output reflect that single turn (no aggregation)
After 10 turns with high cache hit rate: input shows current-turn prompt + 'cache 90%' style label, NOT 1.5M
Context bar % grows with chat length (no shrink on cache hit)
Provider with no cache info (e.g. Ollama, OpenAI Chat-Completions w/o caching): no 'cache N%' label, behavior identical to before

…d of cumulative)

Header used to display input/output token counts cumulatively across the entire chat session. After ~10 turns the input number reads 'hundreds of thousands of tokens' even though the cached prefix is being reused at deep discount on each turn; users panic and think they're being billed full rate every turn.

Switch the display to last-turn (current prompt size + last response size) plus an overall cache-hit percentage, which honestly reflects what the user is actually paying for.

Three changes:

1. types/index.ts: TokenUsage gains optional cacheRead / cacheWrite fields (the existing provider/SDK already produces them; UI just ignored them).
2. agent-runner.ts normalizeTokenUsage: extract cacheRead / cacheWrite from common provider field-name variants:
   - Anthropic: cache_read_input_tokens / cache_creation_input_tokens
   - OpenAI Responses: prompt_tokens_details.cached_tokens
   - pi-ai-core normalized: cacheRead / cacheWrite
3. ContextPanel.tsx:
   - tokenUsage now exposes lastInput + lastOutput as 'input'/'output', plus aggregates (totalCacheRead, cacheHitRate) for the header.
   - contextUsage bar uses (lastInput + lastCacheRead) so the bar doesn't shrink when a cache hit lands (which would feel counterintuitive, since the prompt is still that big).
   - Header shows 'cache N%' inline next to input when totalCacheRead > 0; absent for non-cache providers (current behavior preserved).

github-actions

Findings

[Major] TokenUsage.input is being treated as "uncached-only" in the UI, but normalizeTokenUsage() preserves provider-native semantics. That makes the new panel wrong in both directions: OpenAI-style usage already includes cached prompt tokens in the prompt total, while Anthropic-style usage splits cache writes into cache_creation_input_tokens, which the new input + cacheRead calculation never adds back. Evidence src/main/claude/agent-runner.ts:395-417, src/renderer/components/ContextPanel.tsx:117-119, src/renderer/components/ContextPanel.tsx:142-149, src/renderer/types/index.ts:95-98.
Suggested fix:
```
const cacheRead = typeof cacheReadCandidate === 'number' ? cacheReadCandidate : 0;
const cacheWrite = typeof cacheWriteCandidate === 'number' ? cacheWriteCandidate : 0;
const baseInput = raw.input ?? raw.input_tokens ?? raw.inputTokens;

const normalizedInput =
  raw.prompt_tokens_details !== undefined
    ? (baseInput as number)
    : (baseInput as number) + cacheRead + cacheWrite;

return { input: normalizedInput, output, cacheRead, cacheWrite };
```
Then use u.input directly in ContextPanel for both the header input value and contextUsage.used.
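To make the two provider conventions concrete, here is a self-contained restatement of the suggested normalization with worked inputs. It is a sketch under assumptions: `normalizeToTotalPrompt`, `RawUsage`, and `num` are hypothetical names, the output field variants are guessed by analogy with the input ones, the token counts are invented, and the presence check is an object test rather than the `!== undefined` comparison in the snippet above.

```typescript
interface TokenUsage {
  input: number; // after normalization: total prompt size, cached tokens included
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

type RawUsage = Record<string, unknown>;

// Hypothetical helper: coerce anything that is not a finite number to 0.
const num = (v: unknown): number =>
  typeof v === 'number' && Number.isFinite(v) ? v : 0;

function normalizeToTotalPrompt(raw: RawUsage): TokenUsage {
  const details = raw['prompt_tokens_details'];
  // Presence of prompt_tokens_details is taken to mean the prompt total
  // already includes cached tokens (OpenAI-style accounting).
  const includesCached = typeof details === 'object' && details !== null;

  const cacheRead = includesCached
    ? num((details as RawUsage)['cached_tokens'])
    : num(raw['cacheRead'] ?? raw['cache_read_input_tokens']);
  const cacheWrite = num(raw['cacheWrite'] ?? raw['cache_creation_input_tokens']);
  const baseInput = num(raw['input'] ?? raw['input_tokens'] ?? raw['inputTokens']);
  // Assumption: output uses the same naming variants as input.
  const output = num(raw['output'] ?? raw['output_tokens'] ?? raw['outputTokens']);

  // Anthropic-style usage reports the uncached tail separately, so add the cache parts back.
  const input = includesCached ? baseInput : baseInput + cacheRead + cacheWrite;
  return { input, output, cacheRead, cacheWrite };
}

// Anthropic-style first turn: most of the prompt is written to the cache.
const firstTurn = normalizeToTotalPrompt({
  input_tokens: 200, cache_creation_input_tokens: 9800, cache_read_input_tokens: 0, output_tokens: 300,
});
// Anthropic-style follow-up: prefix read from cache, a little more written.
const followUp = normalizeToTotalPrompt({
  input_tokens: 150, cache_read_input_tokens: 9800, cache_creation_input_tokens: 350, output_tokens: 250,
});
// OpenAI-style: the prompt total already counts the cached tokens.
const openAiStyleTurn = normalizeToTotalPrompt({
  input_tokens: 10_300, prompt_tokens_details: { cached_tokens: 9800 }, output_tokens: 250,
});
console.log(firstTurn.input, followUp.input, openAiStyleTurn.input);
```

In this sketch the follow-up turn and the OpenAI-style turn describe the same 10,300-token prompt and normalize to the same `input`, which is the property the panel needs for both the header value and the context bar.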

Summary

Review mode: initial. 1 issue found. CLAUDE.md was not found in repo/docs. No PR-side regression tests were added for provider-specific token accounting in the context panel.

Testing

Not run (automation). Suggested tests: Anthropic cache-write first turn, Anthropic cache-read follow-up turn, and OpenAI cached prompt totals.

Open Cowork Bot

github-actions · 2026-04-30T09:30:45Z

+  const cacheWrite = typeof cacheWriteCandidate === 'number' ? cacheWriteCandidate : undefined;
+
+  return {
+    input,


[MAJOR] ContextPanel now assumes tokenUsage.input is the uncached tail and rebuilds prompt size as input + cacheRead, but normalizeTokenUsage() is still passing provider-native values through unchanged. That breaks in both directions: OpenAI-style usage already includes cached prompt tokens in the prompt total, while Anthropic-style usage splits cache writes into cache_creation_input_tokens, which this calculation never adds back.

Suggested fix:

```
const cacheRead = typeof cacheReadCandidate === 'number' ? cacheReadCandidate : 0;
const cacheWrite = typeof cacheWriteCandidate === 'number' ? cacheWriteCandidate : 0;
const baseInput = raw.input ?? raw.input_tokens ?? raw.inputTokens;

const normalizedInput =
  raw.prompt_tokens_details !== undefined
    ? (baseInput as number)
    : (baseInput as number) + cacheRead + cacheWrite;

return { input: normalizedInput, output, cacheRead, cacheWrite };
```

Then use u.input directly in ContextPanel for both the header input value and contextUsage.used.

hqhq1025 added and removed the bot-rerun label (temporary label for rerunning bot automation) on Apr 30, 2026

github-actions Bot reviewed Apr 30, 2026

lxistired wants to merge 1 commit into OpenCoworkAI:main from lxistired:feat/token-ui-cache-display
