
feat: add fallback_max_context_tokens config for context compression#7942

Merged
Soulter merged 1 commit into AstrBotDevs:master from Pleiades1726:feat/fallback-max-context-tokens
May 1, 2026

Conversation


@Pleiades1726 Pleiades1726 commented May 1, 2026

Changes

Add a global config option fallback_max_context_tokens as a fallback strategy for context compression.

Background

Previously, when max_context_tokens was 0 and the model was not in LLM_METADATAS, there was no fallback and context compression would never trigger. Now fallback_max_context_tokens (default 128000) is used as the fallback value.

Change list

  • New config option: provider_settings.fallback_max_context_tokens, type int, default 128000
  • Config location: AI Settings → Context Management Strategy → Context Window Fallback
  • i18n: zh-CN, en-US, and ru-RU are all updated
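A minimal sketch of the new setting and how it is read, assuming a plain dict-based settings layout; only the key name `fallback_max_context_tokens` and the 128000 default come from this PR, the surrounding structure is illustrative:

```python
# Sketch of the provider_settings entry described above.
# The dict layout is an assumption for illustration; only the key
# "fallback_max_context_tokens" and its default are from this PR.
provider_settings = {
    "max_context_tokens": 0,  # 0 = auto-detect from model metadata
    "fallback_max_context_tokens": 128000,  # used when auto-detection fails
}

def get_fallback(settings: dict) -> int:
    """Read the fallback value, defaulting to the PR's 128000."""
    return settings.get("fallback_max_context_tokens", 128000)
```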

Behavior changes

| Case | Before | After |
| --- | --- | --- |
| max_context_tokens = 0, model in metadata | auto-detects the model limit | unchanged |
| max_context_tokens = 0, model not in metadata | compression not triggered | uses fallback_max_context_tokens (128k) |
| Default value | 0 (compression not triggered) | 128000 |
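The resolution order in the table can be sketched as a single function; the function name, the simplified LLM_METADATAS shape, and the model names are illustrative assumptions, not the PR's actual code:

```python
# Illustrative sketch of the resolution order described above.
# LLM_METADATAS and the dict shapes are simplified assumptions.
LLM_METADATAS = {"gpt-4o": {"limit": {"context": 128000}}}

def resolve_context_limit(
    model: str,
    max_context_tokens: int,
    fallback_max_context_tokens: int = 128000,
) -> int:
    if max_context_tokens > 0:
        # An explicit user setting always wins.
        return max_context_tokens
    info = LLM_METADATAS.get(model)
    if info:
        # Auto-detect from built-in model metadata.
        return info["limit"]["context"]
    # New in this PR: fall back instead of disabling compression.
    return fallback_max_context_tokens
```

For example, a local model unknown to LLM_METADATAS with max_context_tokens left at 0 now resolves to 128000 instead of leaving compression off.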

Summary by Sourcery

Introduce a configurable fallback context window size for local chat providers and use it when model metadata is unavailable.

New Features:

  • Add global provider_settings.fallback_max_context_tokens configuration for context compression with a default of 128000 tokens.

Enhancements:

  • Change the default max_context_tokens from 0 to 128000 so context compression can trigger by default when limits are unknown.
  • Propagate fallback_max_context_tokens through the main agent build and internal pipeline stages to ensure a consistent fallback behavior across providers.

Documentation:

  • Update configuration metadata and i18n strings to surface the new fallback context window option in the dashboard UI.

@auto-assign auto-assign Bot requested review from Soulter and anka-afk May 1, 2026 11:24
@dosubot dosubot Bot added size:S This PR changes 10-29 lines, ignoring generated files. area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels May 1, 2026

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • The value 128000 is duplicated across several places (ChatProviderTemplate default, MainAgentBuildConfig default, settings.get fallback, and i18n hint text); consider centralizing this as a single constant or config default to avoid drift if it ever changes.
  • Changing ChatProviderTemplate.max_context_tokens default from 0 to 128000 alters existing behavior beyond the described fallback path (e.g., for models with smaller context limits or where users relied on 0 meaning "no compression"); it may be safer to keep the default at 0 and only apply the fallback in the specific max_context_tokens == 0 && model not in metadata branch.
## Individual Comments

### Comment 1
<location path="astrbot/core/astr_main_agent.py" line_range="1372-1374" />
<code_context>
             provider.provider_config["max_context_tokens"] = model_info["limit"][
                 "context"
             ]
+        else:
+            # fallback: default to configured fallback value
+            provider.provider_config["max_context_tokens"] = config.fallback_max_context_tokens

     if event.get_platform_name() == "webchat":
</code_context>
<issue_to_address>
**issue (bug_risk):** Fallback overrides any user-specified `max_context_tokens` when model metadata is missing.

According to the config metadata, the fallback should only apply "when `max_context_tokens` is 0 and the model is not in the built-in metadata". With this `else` branch, any time `model_info` is missing you overwrite `provider.provider_config["max_context_tokens"]`, even if the user explicitly set a non-zero value. To avoid surprising users, only apply the fallback when the current `max_context_tokens` is 0 or unset, e.g.:

```python
max_ctx = provider.provider_config.get("max_context_tokens")
if not model_info:
    if not max_ctx:
        provider.provider_config["max_context_tokens"] = config.fallback_max_context_tokens
else:
    provider.provider_config["max_context_tokens"] = model_info["limit"]["context"]
```
</issue_to_address>

### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/internal.py" line_range="110-111" />
<code_context>
         )
         if self.dequeue_context_length <= 0:
             self.dequeue_context_length = 1
+        self.fallback_max_context_tokens: int = settings.get(
+            "fallback_max_context_tokens", 128000
+        )

</code_context>
<issue_to_address>
**suggestion:** The fallback context default is duplicated here instead of sharing MainAgentBuildConfig’s default.

`128000` is now hard-coded in several places (ChatProviderTemplate, `MainAgentBuildConfig.fallback_max_context_tokens`, and here), which risks them diverging if the default changes. Consider sourcing this value from `MainAgentBuildConfig` or a shared constant so configuration, behavior, and UI stay consistent.

Suggested implementation:

```python
        if self.dequeue_context_length <= 0:
            self.dequeue_context_length = 1

        # Share the same default as MainAgentBuildConfig to keep UI/config/behavior in sync
        from astrbot.core.config.main_agent_build import MainAgentBuildConfig

        self.fallback_max_context_tokens: int = settings.get(
            "fallback_max_context_tokens",
            MainAgentBuildConfig().fallback_max_context_tokens,
        )

        self.llm_safety_mode = settings.get("llm_safety_mode", True)

```

- If `MainAgentBuildConfig` requires constructor arguments (e.g., settings or other config), replace `MainAgentBuildConfig()` with the appropriate initialization used elsewhere in the codebase (for example, `MainAgentBuildConfig.from_settings(settings)` or similar) and keep using its `.fallback_max_context_tokens` default as the fallback.
- If there is already a shared constant or class-level attribute for this default (e.g., `MainAgentBuildConfig.DEFAULT_FALLBACK_MAX_CONTEXT_TOKENS`), prefer using that instead of instantiating the class:

  ```python
  from astrbot.core.config.main_agent_build import MainAgentBuildConfig

  self.fallback_max_context_tokens: int = settings.get(
      "fallback_max_context_tokens",
      MainAgentBuildConfig.DEFAULT_FALLBACK_MAX_CONTEXT_TOKENS,
  )
  ```

- Ensure any other locations that still hard-code `128000` (e.g., `ChatProviderTemplate`) are updated to reference the same shared default to fully de-duplicate the value.
</issue_to_address>



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request implements a fallback mechanism for the LLM context window size when model metadata is unavailable, updating the core logic, configuration schemas, and localization files. Feedback indicates that changing the default max_context_tokens to 128000 breaks the automatic limit detection logic, which relies on the value being 0. Additionally, several cua_-prefixed configuration entries were accidentally deleted from the localization files and need to be restored to maintain dashboard functionality.


astrbot/core/config/default.py (312)

high

Changing the default value of max_context_tokens from 0 to 128000 breaks the automatic model-limit detection logic.

At line 1366 of astrbot/core/astr_main_agent.py, the system checks max_context_tokens <= 0 to decide whether to fetch the model's context limit from LLM_METADATAS automatically. With a default of 128000, newly added providers will no longer trigger auto-detection; even models with known limits (such as the GPT family) will be forced to 128k, which may push the context beyond what the model can actually handle.

It is recommended to keep the default at 0 and let the new fallback_max_context_tokens handle the fallback logic.

    "max_context_tokens": 0,

dashboard/src/i18n/locales/en-US/features/config-metadata.json (189-212)

high

CUA-related configuration metadata (cua_image, cua_os_type, cua_ttl, etc.) was accidentally deleted in this change. These entries are still referenced in astrbot/core/config/default.py, and removing them breaks the corresponding configuration UI in the dashboard. Please restore them.

dashboard/src/i18n/locales/ru-RU/features/config-metadata.json (189-212)

high

CUA-related configuration metadata (cua_image, cua_os_type, cua_ttl, etc.) was accidentally deleted in this change. These entries are still referenced in astrbot/core/config/default.py, and removing them breaks the corresponding configuration UI in the dashboard. Please restore them.

dashboard/src/i18n/locales/zh-CN/features/config-metadata.json (191-214)

high

CUA-related configuration metadata (cua_image, cua_os_type, cua_ttl, etc.) was accidentally deleted in this change. These entries are still referenced in astrbot/core/config/default.py, and removing them breaks the corresponding configuration UI in the dashboard or leaves it missing translations. Please restore them.

@Pleiades1726 Pleiades1726 force-pushed the feat/fallback-max-context-tokens branch from 1377e86 to 07a69ac Compare May 1, 2026 12:00
- New config item fallback_max_context_tokens (default 128k)
- When max_context_tokens is 0 and model not in LLM_METADATAS,
  use fallback_max_context_tokens as the context window limit
- Unified global config under provider_settings, in truncate_and_compress section
- i18n: zh-CN, en-US, ru-RU
@Pleiades1726 Pleiades1726 force-pushed the feat/fallback-max-context-tokens branch from 07a69ac to be3a7ad Compare May 1, 2026 12:10
@Soulter Soulter merged commit aa0b7a2 into AstrBotDevs:master May 1, 2026
20 checks passed
