This repository was archived by the owner on May 13, 2026. It is now read-only.

[Feature Request] Auto-continue responses when DeepSeek hits context limit #494

@glebati-blip


🥰 Requirements description

Been running into this issue quite a bit lately, especially when generating long code files or asking for detailed analysis.

Basically, when you send a request through ds2api and DeepSeek hits its context limit while responding, the web UI normally shows a "Continue" button and lets you pick up where it left off. But ds2api doesn't handle this at all right now. The client just gets a truncated response, and that's it.

Here's a concrete example:

Someone asks DeepSeek to generate a 500-line Python script. It gets through maybe 400 lines, then stops with something like "Context limit reached, please continue." In the web interface you'd just click Continue and it finishes.

What currently happens:

  • Client gets incomplete response
  • No way to trigger continuation through the API
  • The conversation is effectively broken unless the client implements its own retry logic, which is messy

What should happen instead:

  • ds2api detects that the response got cut off
  • Automatically requests the continuation behind the scenes
  • Client receives the complete response without ever knowing anything went wrong

I've looked through the existing documentation and didn't see anything about handling this case. The current_input_file feature is great for pre-request context management, but it doesn't help when the limit gets hit mid-response.

Would be really nice to have this handled at the ds2api level so client apps don't need to implement their own workarounds.

🧐 Solution

Here's what I'm thinking for the auto-continue implementation.

First, detection. The system should watch for signs that DeepSeek hit the context limit mid-response.
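A minimal sketch of what detection could look like. The issue does not say which signal DeepSeek actually sends, so both checks here are assumptions: the `finish_reason` field and its `"length"` value are hypothetical, and the marker string is only the "something like" text from the example above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: marker text taken from the example in this issue, not from
# any documented upstream behavior.
TRUNCATION_MARKERS = ("Context limit reached",)


@dataclass
class UpstreamResult:
    text: str
    # Hypothetical field; "length" is assumed to mean "cut off by a limit".
    finish_reason: Optional[str] = None


def looks_truncated(result: UpstreamResult) -> bool:
    """Return True if the upstream response appears to have been cut off."""
    if result.finish_reason == "length":
        return True
    # Only inspect the tail so a marker quoted earlier in a long answer
    # does not trigger a false positive.
    tail = result.text[-200:]
    return any(marker in tail for marker in TRUNCATION_MARKERS)
```

Checking an explicit finish signal first and falling back to text matching keeps the string heuristic as a last resort, since matching on text alone can misfire when the model merely talks about context limits.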

Second, the continuation logic. Once we detect a cut-off, ds2api should automatically request the next part using the same session ID. The web UI does this with a button click, so we'd just automate that step. It needs to handle multiple continues, because one isn't always enough for really long outputs. I'd also add a small delay between continuation requests to avoid hammering the upstream.
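The loop described above could be sketched roughly like this. `fetch_continuation` is a hypothetical callable standing in for whatever ds2api would use to request the next part on the same session; it is not an existing ds2api function.

```python
import time
from typing import Callable, List, Tuple

# Hypothetical: takes a session ID, returns (text, truncated).
FetchPart = Callable[[str], Tuple[str, bool]]


def collect_with_continues(
    first_text: str,
    first_truncated: bool,
    session_id: str,
    fetch_continuation: FetchPart,
    max_continues: int = 5,
    continue_delay_ms: int = 500,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, int]:
    """Keep requesting continuations until the output is complete or the cap is hit.

    Returns the joined text and the number of continuation requests made.
    """
    parts: List[str] = [first_text]
    truncated = first_truncated
    continues = 0
    while truncated and continues < max_continues:
        sleep(continue_delay_ms / 1000)  # small pause between upstream requests
        text, truncated = fetch_continuation(session_id)
        parts.append(text)
        continues += 1
    return "".join(parts), continues
```

The `sleep` parameter is injectable only so the loop can be exercised without real delays. If the cap is reached while the output is still truncated, this sketch simply returns what it has; whether to surface that as an error is an open design question.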

Third, response merging. The client shouldn't see any of this happening. All chunks from the original response and any continuations should be merged into a single stream and sent out as if it were one complete response. This has to work across all protocols (OpenAI, Claude, Gemini) without breaking their expected event structures.
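For streaming, one possible approach is to pass text deltas through as they arrive and hold back each intermediate end-of-stream event, so the client only ever sees one. The sketch below uses a made-up, protocol-neutral event shape (`("delta", text)` / `("finish", reason)`); translating that into OpenAI, Claude, or Gemini events would be a separate step, and `open_continuation` is a hypothetical callable.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical internal event shape: ("delta", text) or ("finish", reason).
Event = Tuple[str, str]


def merge_streams(
    first: Iterable[Event],
    open_continuation: Callable[[], Iterable[Event]],
    max_continues: int = 5,
) -> Iterator[Event]:
    """Yield one continuous stream built from the original and its continuations."""
    stream = first
    continues = 0
    while True:
        reason = "stop"  # assumed default if a stream ends without a finish event
        for kind, value in stream:
            if kind == "delta":
                yield ("delta", value)
            else:
                reason = value  # hold the finish event back for now
        if reason == "length" and continues < max_continues:
            stream = open_continuation()
            continues += 1
            continue
        # Emit exactly one finish event for the whole merged response.
        yield ("finish", reason)
        return
```

Because the finish event is withheld rather than rewritten, the client-facing stream keeps a single start and a single end regardless of how many continuations happened behind it.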

Configuration-wise, I'd add a new block to config.json like this:

{
  "auto_continue": {
    "enabled": false,
    "max_continues": 5,
    "detection_timeout_ms": 30000,
    "continue_delay_ms": 500
  }
}

Disabled by default to keep current behavior unchanged. max_continues prevents infinite loops if something goes wrong. The timeout and delay settings give some control over how aggressive the retry logic is.
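As an illustration of how these defaults could be applied when the block is missing or partial, here is a small sketch. The class and function names are hypothetical, and ds2api's real config loader may work quite differently.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoContinueConfig:
    # Defaults mirror the proposed config.json block.
    enabled: bool = False
    max_continues: int = 5
    detection_timeout_ms: int = 30000
    continue_delay_ms: int = 500


def load_auto_continue(raw_json: str) -> AutoContinueConfig:
    """Parse the proposed auto_continue block, falling back to defaults."""
    block = json.loads(raw_json).get("auto_continue", {})
    cfg = AutoContinueConfig(**block)  # unknown keys raise TypeError
    if min(cfg.max_continues, cfg.detection_timeout_ms, cfg.continue_delay_ms) < 0:
        raise ValueError("auto_continue values must be non-negative")
    return cfg
```

With this shape, an existing config.json that has no `auto_continue` block loads as disabled, which matches the goal of leaving current behavior unchanged.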

One thing to note: this is different from the existing current_input_file feature. That one handles context before the request by uploading history as a file. This new feature deals with context exhaustion that happens while DeepSeek is already generating. The two complement each other pretty well.

On the technical side, the continuation request needs the same conversation context with minimal overhead, and it should work with both streaming and non-streaming responses. It would be good to log every auto-continue action so admins can debug if something acts weird. Maybe also add a header like X-Auto-Continued: true, so clients that care can tell the response required automatic continuation.
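The logging and header idea could look something like the following for a non-streaming response. The logger name and the `annotate_response` helper are hypothetical; only the X-Auto-Continued header name comes from the proposal above.

```python
import logging
from typing import Dict

logger = logging.getLogger("ds2api.auto_continue")  # hypothetical logger name


def annotate_response(headers: Dict[str, str], session_id: str, continues: int) -> Dict[str, str]:
    """Return a copy of the headers, marked if any auto-continue happened."""
    out = dict(headers)  # do not mutate the caller's dict
    if continues > 0:
        logger.info("auto-continued session %s %d time(s)", session_id, continues)
        out["X-Auto-Continued"] = "true"
    return out
```

One caveat: for streaming responses the headers normally go out before the first chunk, so whether a continuation will be needed is not yet known at that point. The header is probably only practical for non-streaming responses unless the signal is carried some other way.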

This would be really useful for long code generation, big analytical outputs, or any scenario where you tend to hit the context limit halfway through a response.

📝 Additional information

No response

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request)
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests