fix: align /v1/chat/completions SSE stream with OpenAI spec#1262
Open
Conversation
- Add the missing `data: [DONE]\n\n` terminator at the end of the stream; clients that follow the SSE spec were hanging waiting for this sentinel
- Emit a role-only initial chunk `{role: "assistant", content: ""}` once per choice before any content, matching OpenAI / vLLM behaviour
- Remove `role: "assistant"` from all subsequent delta chunks (reasoning, tool-call text, and plain content paths) so the role appears only in the first chunk, as the spec requires
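The corrected per-choice ordering can be sketched as follows. This is a minimal illustration, not the actual LightLLM code: the function name, model name, and payload fields are assumptions, kept to the fields relevant here.

```python
import json

def sse_chunks(choice_index, content_pieces, model="example-model"):
    """Yield SSE lines for one streamed choice, in the corrected order:
    a role-only first chunk, role-free content deltas, then [DONE]."""
    def chunk(delta, finish_reason=None):
        payload = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{
                "index": choice_index,
                "delta": delta,
                "finish_reason": finish_reason,
            }],
        }
        return f"data: {json.dumps(payload)}\n\n"

    # First chunk per choice: role only, empty content (OpenAI / vLLM behaviour).
    yield chunk({"role": "assistant", "content": ""})
    # Subsequent deltas carry content but never repeat the role.
    for piece in content_pieces:
        yield chunk({"content": piece})
    yield chunk({}, finish_reason="stop")
    # Sentinel that SSE-spec-following clients block on.
    yield "data: [DONE]\n\n"
```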
Contributor
Code Review
This pull request brings the OpenAI API streaming implementation in line with the SSE specification: it emits an initial role-only chunk, removes the role from subsequent deltas, and appends a `[DONE]` terminator to the end of the stream. Review feedback: encode the terminator string to bytes so it matches the function's type hint and the rest of the codebase.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
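The bytes-encoding feedback amounts to a one-line change; a sketch, assuming the stream generator is annotated as yielding `bytes` (the helper name is hypothetical):

```python
def done_sentinel() -> bytes:
    # Encoding keeps the terminator consistent with the other byte chunks
    # the generator yields, and with its bytes type hint.
    return "data: [DONE]\n\n".encode("utf-8")
```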
Summary
- Add a `data: [DONE]\n\n` terminator at the end of `/v1/chat/completions` streams — clients following the SSE spec were hanging waiting for this sentinel (the `/v1/completions` endpoint already emitted it, so this was an inconsistency within LightLLM itself)
- Emit `delta: {role: "assistant", content: ""}` once per choice before any content, matching OpenAI API / vLLM behaviour
- Remove `role: "assistant"` from all subsequent delta chunks (plain content, tool-call text, and reasoning paths) so `role` appears only in the first chunk as the spec requires

Chunk sequence after this fix:
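Illustratively, the wire format after this fix looks like the following (abbreviated: real chunks also carry `id`, `object`, `created`, and `model` fields):

```
data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```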
Test Plan
- First streamed chunk carries `role: "assistant"` with `content: ""`
- Subsequent chunks do not repeat `role`
- `data: [DONE]` appears as the final line of the stream
- With `stream_options: {include_usage: true}` — usage chunk precedes `[DONE]`
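The checks above can be collected into a small validator; a sketch, assuming the client has already split the stream into its `data:` lines (the helper and its argument shape are hypothetical, not part of the PR):

```python
import json

def validate_stream(lines, expect_usage=False):
    """Assert the test-plan invariants on a received list of SSE data lines."""
    assert lines[-1] == "data: [DONE]", "stream must end with the sentinel"
    payloads = [json.loads(line[len("data: "):]) for line in lines[:-1]]
    seen_role = set()
    for payload in payloads:
        for choice in payload.get("choices", []):
            delta = choice["delta"]
            if choice["index"] not in seen_role:
                # First chunk per choice: role-only, empty content.
                assert delta.get("role") == "assistant"
                assert delta.get("content") == ""
                seen_role.add(choice["index"])
            else:
                # Later chunks must not repeat the role.
                assert "role" not in delta
    if expect_usage:
        # With stream_options.include_usage, usage precedes [DONE].
        assert "usage" in payloads[-1]
```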