@AyushInKC
Summary

This pull request fixes the SSE streaming implementation in the Cohere Java SDK. The current SSEIterator in Stream.java appends every data: line to a single buffer and then tries to parse the whole buffer as one JSON object. Cohere’s SSE responses carry one complete JSON event per data: line, so the aggregated buffer is not valid JSON, and streaming fails with JSON parsing errors and type conversion failures.

This PR updates the parsing logic to correctly interpret SSE events and restore functional streaming behavior.

Root Cause

Cohere’s streaming API uses standard Server-Sent Events format:

event: message-start
data: { ... }

event: content-delta
data: { ... }

The issue arises because the current implementation:

Appends all data: lines into a shared buffer.

Attempts to parse the entire buffer as JSON when a blank line appears.

This fails for several reasons:

Multiple data: events do not form a valid combined JSON object.

event: lines are not JSON and should not be parsed.

content-delta events are emitted incrementally and must be handled individually.

This incorrect aggregation leads to parse errors and prevents proper incremental streaming.
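To illustrate the aggregation problem, here is a minimal sketch, not taken from Stream.java. It assumes the old buffer joined data: payloads with a newline, and it uses a hypothetical helper, countTopLevelObjects, to show that the joined buffer holds two top-level JSON objects rather than one. A parser that expects a single JSON document would reject such input.

```java
class AggregatedBufferSketch {
    /**
     * Counts top-level JSON objects in the text.
     * Braces inside string literals are ignored. Hypothetical helper, for illustration only.
     */
    public static int countTopLevelObjects(String text) {
        int depth = 0;
        int objects = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                // Inside a string literal: only look for the closing quote.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    objects++; // one complete top-level object closed
                }
            }
        }
        return objects;
    }

    public static void main(String[] args) {
        // Placeholder payloads; real events carry more fields.
        String first = "{\"type\":\"message-start\"}";
        String second = "{\"type\":\"content-delta\"}";
        // Assumption: the old buffer joined payloads with a newline.
        String aggregated = first + "\n" + second;
        System.out.println(countTopLevelObjects(first));      // prints 1
        System.out.println(countTopLevelObjects(aggregated)); // prints 2
    }
}
```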

Fix Implemented

This pull request introduces the following corrections:

Only lines beginning with data: are parsed as individual JSON events.

event: lines and other non-JSON lines are skipped entirely.

Each data: line is decoded and delivered to the consumer immediately.

Additional null checks and type-safety guards eliminate ClassCastException and similar issues.

The updated flow follows SSE semantics and Cohere’s documented streaming behavior.

These changes ensure correct handling of message-start, content-start, and content-delta events, and allow incremental output to function as expected.
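The per-line handling described above can be sketched roughly as follows. This is not the code from this PR: the class and method names (SseDataLineSketch, extractData, readPayloads) are hypothetical, and the sketch returns raw payload strings instead of decoding them with the SDK's JSON mapper. It also collects payloads into a list for simplicity, whereas a real iterator would hand each one to the consumer as soon as it is read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class SseDataLineSketch {
    private static final String DATA_PREFIX = "data:";

    /** Returns the payload of a data: line, or null for any other line. */
    public static String extractData(String line) {
        if (line == null || !line.startsWith(DATA_PREFIX)) {
            return null; // event: lines, blank lines, and comments are skipped
        }
        String payload = line.substring(DATA_PREFIX.length()).trim();
        return payload.isEmpty() ? null : payload;
    }

    /** Reads the stream line by line and keeps one payload per data: line. */
    public static List<String> readPayloads(BufferedReader reader) throws IOException {
        List<String> payloads = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String payload = extractData(line);
            if (payload != null) {
                payloads.add(payload); // each payload would be decoded on its own
            }
        }
        return payloads;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder stream; real payloads carry more fields.
        String stream = "event: message-start\n"
                + "data: {\"type\":\"message-start\"}\n"
                + "\n"
                + "event: content-delta\n"
                + "data: {\"type\":\"content-delta\"}\n"
                + "\n";
        List<String> payloads = readPayloads(new BufferedReader(new StringReader(stream)));
        System.out.println(payloads.size()); // prints 2
    }
}
```

Because extractData returns null for anything that is not a non-empty data: line, the caller needs only one null check before decoding.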

Tests

The streaming tests in StreamTest.java have been added or updated to verify:

Correct parsing of message-start and content-delta events.

That only data: lines are parsed as JSON.

That no parsing exceptions occur during streaming.

That incremental output is assembled correctly from multiple content-delta chunks.
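As an illustration of the last check, the following sketch assembles text from content-delta chunks. It does not reproduce StreamTest.java or the SDK's event classes: Event is a hypothetical stand-in for an already-decoded event with a type and an optional text field, and assemble is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.List;

class DeltaAssemblySketch {
    /** Hypothetical decoded event: a type name plus optional text. */
    public static final class Event {
        final String type;
        final String text;

        public Event(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    /** Concatenates the text of content-delta events in arrival order. */
    public static String assemble(List<Event> events) {
        StringBuilder out = new StringBuilder();
        for (Event e : events) {
            if (e == null || e.text == null) {
                continue; // null guard: events without text contribute nothing
            }
            if ("content-delta".equals(e.type)) {
                out.append(e.text);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("message-start", null),
                new Event("content-start", null),
                new Event("content-delta", "Hel"),
                new Event("content-delta", "lo"));
        System.out.println(assemble(events)); // prints Hello
    }
}
```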

All tests pass:

./gradlew test --tests com.cohere.api.StreamTest

Impact

With this fix, SSE streaming in the Java SDK behaves correctly again. Applications that depend on token-by-token or incremental output will now receive updates reliably without runtime parsing errors. The changes do not affect the public API and remain fully backward-compatible.

Closing

This pull request resolves Issue #48 by correcting the SSE parsing logic and aligning the SDK with standard SSE behavior and Cohere’s streaming protocol. I am happy to make any adjustments requested during review.

@AyushInKC AyushInKC requested a review from a team as a code owner December 12, 2025 07:22