fix(agent-runner): forward pasted image content to vision-capable models by lxistired · Pull Request #168 · OpenCoworkAI/open-cowork

lxistired · 2026-04-28T11:19:29Z

Summary

When a user pastes an image in ChatView, the renderer creates a content block in Anthropic Messages API shape:

{ type: 'image', source: { type: 'base64', media_type, data } }

ClaudeAgentRunner in src/main/claude/agent-runner.ts only forwarded the prompt text via piSession.prompt(string), so the image bytes were silently dropped before reaching the model. The pre-existing hasImages flag was set but never acted on — vision-capable agents received only the text.

This patch:

Walks the current user message's content blocks and extracts any type:'image' entries.
Normalises the Anthropic shape (source.media_type, source.data) to the pi-coding-agent shape ({type:'image', mimeType, data}). Already-normalised blocks are also accepted, for forward compatibility.
Passes the extracted images via piSession.prompt(text, { images: currentTurnImages }). The text-only path is preserved when the user didn't paste anything, so the wire format is unchanged in the common case.

Only the current user turn is processed — prior turns are untouched. The existing hasImages log line now also reports the extracted count for easier debugging.
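
The extraction and normalisation steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the diff: the helper name `extractImages` and the `ContentBlock` type are hypothetical, and skipping malformed image blocks is an assumption about reasonable behaviour rather than something the PR states.

```typescript
// Shape expected by pi-coding-agent, as described in the PR summary.
interface PiImage {
  type: "image";
  mimeType: string;
  data: string;
}

// Loose block type (hypothetical): real message content may be typed differently.
type ContentBlock = { type: string; [key: string]: unknown };

// Hypothetical helper: collects image blocks from one user message and
// normalises them to { type: 'image', mimeType, data }.
function extractImages(content: ContentBlock[]): PiImage[] {
  const images: PiImage[] = [];
  for (const block of content) {
    if (block.type !== "image") continue;

    // Anthropic shape keeps the payload under `source`.
    const source = block.source as
      | { media_type?: unknown; data?: unknown }
      | undefined;

    // Prefer the Anthropic fields, fall back to already-normalised ones.
    const mimeType = source?.media_type ?? block.mimeType;
    const data = source?.data ?? block.data;

    // Assumption: blocks missing either field are skipped, not forwarded.
    if (typeof mimeType === "string" && typeof data === "string") {
      images.push({ type: "image", mimeType, data });
    }
  }
  return images;
}
```

With a helper like this, the runner would call `piSession.prompt(text, { images: currentTurnImages })` only when the returned array is non-empty, and fall back to `piSession.prompt(text)` otherwise, which keeps the text-only request unchanged.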

Test plan

Paste an image (PNG/JPEG) into ChatView, send with a question like "what's in this image?", confirm the model describes it instead of saying it can't see it.
Send a text-only message, confirm no behaviour change (no images key in the underlying request).
Send a message with multiple pasted images, confirm all are forwarded.
Confirm the log line shows, e.g., User message contains images: 2 extracted.

When a user pastes an image in ChatView, the renderer creates a content block of shape `{type:'image', source:{type:'base64', media_type, data}}` following the Anthropic Messages API. The agent runner only forwarded the prompt text via `piSession.prompt(string)`, so the image bytes were silently dropped before reaching the model.

This change extracts image blocks from the current user message, normalises them to the pi-coding-agent shape (`{type:'image', mimeType, data}`), and passes them via the `{images: [...]}` option on `prompt()`. The text-only path is preserved so we don't change the wire format when no images are present.

Notes:
- Reads `source.media_type` / `source.data` (Anthropic shape) but also accepts already-normalised `mimeType` / `data` for forward compat.
- Only the current user turn is processed; prior assistant/tool turns are unchanged.
- The existing `hasImages` log line now reports the number of extracted images for easier debugging.

github-actions

Findings

No high-confidence issues found in the added/modified lines.

Summary

Review mode: initial. No correctness, security, or regression issues were identified in src/main/claude/agent-runner.ts on the modified lines.
Residual risk: this change adds a new image-forwarding path in src/main/claude/agent-runner.ts without focused regression coverage under tests/, so future SDK-shape drift may not be caught automatically.

Testing

Not run (automation). Suggested follow-up: add an agent-runner regression test that asserts image blocks are normalized to { type: 'image', mimeType, data } and passed to piSession.prompt(..., { images }).
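
A regression test along the lines suggested above might look roughly like this. It is a framework-free sketch under stated assumptions: `RecordingSession`, `forwardTurn`, `check`, and `runRegressionChecks` are hypothetical names, and `forwardTurn` is only a stand-in for the real logic in src/main/claude/agent-runner.ts, which a real test would import instead. The actual signature of piSession.prompt may differ from the one assumed here.

```typescript
interface PiImage { type: "image"; mimeType: string; data: string }
interface PromptOptions { images: PiImage[] }
type Block = {
  type: string;
  source?: { media_type?: string; data?: string };
  mimeType?: string;
  data?: string;
};

// Test double: records every prompt() call instead of contacting a model.
class RecordingSession {
  calls: Array<{ text: string; options: PromptOptions | undefined }> = [];
  prompt(text: string, options?: PromptOptions): Promise<void> {
    this.calls.push({ text, options });
    return Promise.resolve();
  }
}

// Stand-in for the runner logic under test (hypothetical).
function forwardTurn(session: RecordingSession, text: string, content: Block[]): Promise<void> {
  const images: PiImage[] = [];
  for (const b of content) {
    if (b.type !== "image") continue;
    const mimeType = b.source?.media_type ?? b.mimeType;
    const data = b.source?.data ?? b.data;
    if (mimeType !== undefined && data !== undefined) {
      images.push({ type: "image", mimeType, data });
    }
  }
  // Text-only turns must not carry an options object at all.
  return images.length > 0 ? session.prompt(text, { images }) : session.prompt(text);
}

function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(message);
}

function runRegressionChecks(): void {
  // Image path: the Anthropic-shaped block is normalised and forwarded.
  const withImage = new RecordingSession();
  void forwardTurn(withImage, "what's in this image?", [
    { type: "text" },
    { type: "image", source: { media_type: "image/png", data: "AAAA" } },
  ]);
  const expected = { images: [{ type: "image", mimeType: "image/png", data: "AAAA" }] };
  check(
    JSON.stringify(withImage.calls[0]?.options) === JSON.stringify(expected),
    "image block was not normalised and forwarded",
  );

  // Text-only path: prompt() is called once, without options.
  const textOnly = new RecordingSession();
  void forwardTurn(textOnly, "hello", [{ type: "text" }]);
  const call = textOnly.calls[0];
  check(textOnly.calls.length === 1 && call !== undefined && call.options === undefined,
    "text-only turn should not pass an images option");
}
```

In the repository, the same assertions would more naturally live under tests/ in whatever test framework the project already uses, with the recording double injected in place of the real session.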

Open Cowork Bot

hqhq1025 added and removed the bot-rerun label (temporary label for rerunning bot automation) on Apr 30, 2026

github-actions Bot reviewed Apr 30, 2026

lxistired wants to merge 1 commit into OpenCoworkAI:main from lxistired:feat/image-paste-fix


2 participants
