Fallback to sync invocation if streaming fails#135

Open
hubertzub-db wants to merge 5 commits into databricks:main from hubertzub-db:fix-stream-fallback

Conversation

hubertzub-db (Contributor) commented Feb 25, 2026

Summary

  • When the upstream streamText call encounters an error mid-stream, the chat server now transparently falls back to a non-streaming generateText call so the user still gets a response instead of just an error message.
  • I had to replace the previous writer.merge() approach with a manual drainStreamToWriter helper that reads chunks individually, detects errors, and triggers the fallback path when needed.
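
A minimal sketch of the drain-and-detect idea described above, not the code in this PR. The `Chunk` and `Writer` shapes are simplified stand-ins (assumptions) for the AI SDK types, and the stream is modeled as a plain async iterable rather than a reader:

```typescript
// Hypothetical, simplified shapes; the real UI message stream has more part types.
type Chunk =
  | { type: "text-delta"; delta: string }
  | { type: "error"; errorText: string };

interface Writer {
  write(chunk: Chunk): void;
}

// Forwards every non-error chunk to the writer and reports whether the
// stream produced an error part or threw while being read.
async function drainStreamToWriter(
  stream: AsyncIterable<Chunk>,
  writer: Writer,
): Promise<{ failed: boolean }> {
  let failed = false;
  try {
    for await (const chunk of stream) {
      if (chunk.type === "error") {
        failed = true; // hold the error back so the caller can try a fallback
        continue;
      }
      writer.write(chunk);
    }
  } catch {
    failed = true; // the iterator itself rejected
  }
  return { failed };
}
```

The caller can inspect `failed` and decide whether to invoke the non-streaming path; with `writer.merge()` that decision point would not be available.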

Demo/test (simulated error in agent's stream_handler)

stream-fallback.mov

Regression test (streaming still works)

stream-regression.mov

Demo of complete failure (both stream_handler and invoke_handler in agent are failing)

stream-complete-failure.mov

Test plan

  • Verify normal streaming still works end-to-end (no regression)
  • Simulate a streaming failure (e.g. upstream timeout or model error) and confirm the fallback fires and the user receives a complete response
  • Confirm error logging appears in the server console when fallback is triggered
  • Verify that if both streaming and fallback fail, the client receives a data-error message
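
The three outcomes in the test plan can be sketched as a small orchestration function. This is an illustration under assumptions, not the server code: `runWithFallback`, `Sink`, and the `data` field on the `data-error` part are hypothetical names and shapes.

```typescript
type Outcome = "streamed" | "fallback" | "failed";

// Hypothetical sink; stands in for whatever writes parts to the client.
interface Sink {
  write(part: { type: string; [key: string]: unknown }): void;
}

async function runWithFallback(
  streamAttempt: () => Promise<boolean>, // resolves true if the stream hit an error
  fallbackAttempt: () => Promise<string>, // non-streaming call returning the full text
  sink: Sink,
  log: (message: string) => void = console.error,
): Promise<Outcome> {
  let failed: boolean;
  try {
    failed = await streamAttempt();
  } catch {
    failed = true;
  }
  if (!failed) return "streamed";

  log("streaming failed, falling back to non-streaming invocation");
  try {
    const text = await fallbackAttempt();
    sink.write({ type: "text-delta", delta: text });
    return "fallback";
  } catch (err) {
    // Both paths failed: surface a data-error part to the client.
    sink.write({ type: "data-error", data: String(err) });
    return "failed";
  }
}
```

Injecting the two attempts and the logger as parameters keeps each bullet above checkable in isolation: pass a stub that fails and assert on the returned outcome, the logged message, and the parts written.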

hubertzub-db changed the title from "nice1" to "Fix streaming fallback" Feb 25, 2026
hubertzub-db changed the title from "Fix streaming fallback" to "Fallback to sync invocation if streaming fails" Feb 25, 2026
* Reads all chunks from a UI message stream, forwarding non-error parts to the
* writer. Returns whether the stream encountered any errors.
*/
async function drainStreamToWriter(
Contributor Author

note: unfortunately we have to merge the streams manually, otherwise we won't have enough control to transparently switch over to fallback :/

Contributor

ah sorry i think i wasn't clear, but we only want to try the non-streaming path if the streaming path never starts -- for errors that are within the stream itself, we do not want to fall back, right? this could result in duplicate content

Contributor

ex. when there is a network disconnect, we shouldn't be retrying the synchronous path, we should be trying to resume the stream.

pr looking at how the stream resumption works for context: #121

we could maybe? look at whether or not any content was emitted (ex. there was never an SSE w/ content in it?)
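
A rough sketch of that check, assuming simplified part shapes and treating only a non-empty `text-delta` as content (`chooseRecovery` and the `Recovery` labels are hypothetical names for illustration):

```typescript
type Part =
  | { type: "start" }
  | { type: "text-delta"; delta: string }
  | { type: "error"; errorText: string }
  | { type: "finish" };

type Recovery = "none" | "fallback" | "no-fallback";

// Assumption: only non-empty text deltas count as user-visible content.
function isContent(part: Part): boolean {
  return part.type === "text-delta" && part.delta.length > 0;
}

// Fall back to the non-streaming path only when the error arrives before
// any content; after content, a fallback would duplicate what was shown.
function chooseRecovery(parts: readonly Part[]): Recovery {
  let contentSeen = false;
  for (const part of parts) {
    if (isContent(part)) contentSeen = true;
    if (part.type === "error") return contentSeen ? "no-fallback" : "fallback";
  }
  return "none";
}

console.log(chooseRecovery([{ type: "start" }, { type: "error", errorText: "x" }])); // "fallback"
```

In the `no-fallback` case the server would leave recovery to something like stream resumption instead of retrying synchronously.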

Contributor Author

hubertzub-db, Feb 26, 2026

> ah sorry i think i wasn't clear, but we only want to try the non-streaming path if the streaming path never starts

If streaming never starts because the agent's streaming handler crashes at the beginning, then chunk.value.type in the middleware's drainStreamToWriter loop will be set to error. That's because the agent framework/aisdk detects a chunked stream even if the agent's stream function was never registered. In other words, even if agent<->middleware streaming fails completely at the very start, the stream is still detected and passes the initial messages (handshakes etc.).
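
A toy model of that behavior, to make the point concrete. This is not the framework's code: `frameStream` is a hypothetical stand-in for the layer that opens the stream before the handler runs, and the part shapes are simplified.

```typescript
type Part =
  | { type: "start" }
  | { type: "text-delta"; delta: string }
  | { type: "error"; errorText: string };

// Toy stand-in for the framework layer: it opens the stream before calling
// the handler, so a handler that throws immediately still yields parts.
async function* frameStream(
  handler: () => AsyncIterable<string>,
): AsyncGenerator<Part> {
  yield { type: "start" }; // handshake-style part, sent regardless
  try {
    for await (const delta of handler()) {
      yield { type: "text-delta", delta };
    }
  } catch (err) {
    const errorText = err instanceof Error ? err.message : String(err);
    yield { type: "error", errorText };
  }
}

// Collects the part types in arrival order.
async function collectTypes(stream: AsyncIterable<Part>): Promise<string[]> {
  const types: string[] = [];
  for await (const part of stream) types.push(part.type);
  return types;
}
```

With a handler that throws before yielding anything, `collectTypes` resolves to `["start", "error"]`: the consumer sees a started stream followed by an error part, so "the stream never started" cannot be detected by the absence of chunks alone.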

Contributor Author

hmm since that's not trivial, maybe let's discuss this offline

Contributor

bbqiu left a comment

thank you for working on this! i wasn't clear in specifying what type of fallback we wanted, but if it's possible to only fall back when the stream doesn't start in the first place, that would be very helpful

also, once we make the change, if we could write a playwright integration test for this, that would be super helpful

bbqiu self-requested a review February 26, 2026 21:09
Contributor Author

@bbqiu I've

Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
Signed-off-by: Hubert Zub <hubert.zub@databricks.com>
hubertzub-db force-pushed the fix-stream-fallback branch from 9b9b671 to 940cdb3 March 5, 2026 12:18
hubertzub-db requested review from bbqiu and removed request for bbqiu March 5, 2026 12:18
2 participants