feat: add fallback_max_context_tokens config for context compression #7942
Conversation
Hey - I've found 2 issues and left some high-level feedback:
- The value `128000` is duplicated across several places (ChatProviderTemplate default, MainAgentBuildConfig default, settings.get fallback, and i18n hint text); consider centralizing this as a single constant or config default to avoid drift if it ever changes.
- Changing `ChatProviderTemplate.max_context_tokens` default from `0` to `128000` alters existing behavior beyond the described fallback path (e.g., for models with smaller context limits or where users relied on `0` meaning "no compression"); it may be safer to keep the default at `0` and only apply the fallback in the specific `max_context_tokens == 0 && model not in metadata` branch.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The value `128000` is duplicated across several places (ChatProviderTemplate default, MainAgentBuildConfig default, settings.get fallback, and i18n hint text); consider centralizing this as a single constant or config default to avoid drift if it ever changes.
- Changing `ChatProviderTemplate.max_context_tokens` default from `0` to `128000` alters existing behavior beyond the described fallback path (e.g., for models with smaller context limits or where users relied on `0` meaning "no compression"); it may be safer to keep the default at `0` and only apply the fallback in the specific `max_context_tokens == 0 && model not in metadata` branch.
## Individual Comments
### Comment 1
<location path="astrbot/core/astr_main_agent.py" line_range="1372-1374" />
<code_context>
provider.provider_config["max_context_tokens"] = model_info["limit"][
"context"
]
+ else:
+ # fallback: default to configured fallback value
+ provider.provider_config["max_context_tokens"] = config.fallback_max_context_tokens
if event.get_platform_name() == "webchat":
</code_context>
<issue_to_address>
**issue (bug_risk):** Fallback overrides any user-specified `max_context_tokens` when model metadata is missing.
According to the config metadata, the fallback should only apply "when `max_context_tokens` is 0 and the model is not in the built-in metadata". With this `else` branch, any time `model_info` is missing you overwrite `provider.provider_config["max_context_tokens"]`, even if the user explicitly set a non-zero value. To avoid surprising users, only apply the fallback when the current `max_context_tokens` is 0 or unset, e.g.:
```python
max_ctx = provider.provider_config.get("max_context_tokens")
if not model_info:
if not max_ctx:
provider.provider_config["max_context_tokens"] = config.fallback_max_context_tokens
else:
provider.provider_config["max_context_tokens"] = model_info["limit"]["context"]
```
</issue_to_address>
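The precedence the review suggests can be expressed as a small pure function. This is a sketch; `resolve_max_context_tokens` and its signature are hypothetical names for illustration, not part of the AstrBot codebase:

```python
def resolve_max_context_tokens(user_value, model_info, fallback):
    """Pick the effective context limit.

    Precedence (as the review suggests): model metadata wins when
    available; otherwise an explicit non-zero user value is kept; only
    when both are absent does the configured fallback apply.
    """
    if model_info:
        return model_info["limit"]["context"]
    if user_value:  # user explicitly set a non-zero limit
        return user_value
    return fallback

# metadata present: metadata wins
assert resolve_max_context_tokens(0, {"limit": {"context": 32768}}, 128000) == 32768
# no metadata, user-set value: keep it (the case this comment flags)
assert resolve_max_context_tokens(8192, None, 128000) == 8192
# no metadata and unset/zero: only now apply the fallback
assert resolve_max_context_tokens(0, None, 128000) == 128000
```

Factoring the decision out like this also makes the three-way precedence unit-testable in isolation, instead of living inline in the agent build path.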
### Comment 2
<location path="astrbot/core/pipeline/process_stage/method/agent_sub_stages/internal.py" line_range="110-111" />
<code_context>
)
if self.dequeue_context_length <= 0:
self.dequeue_context_length = 1
+ self.fallback_max_context_tokens: int = settings.get(
+ "fallback_max_context_tokens", 128000
+ )
</code_context>
<issue_to_address>
**suggestion:** The fallback context default is duplicated here instead of sharing MainAgentBuildConfig’s default.
`128000` is now hard-coded in several places (ChatProviderTemplate, `MainAgentBuildConfig.fallback_max_context_tokens`, and here), which risks them diverging if the default changes. Consider sourcing this value from `MainAgentBuildConfig` or a shared constant so configuration, behavior, and UI stay consistent.
Suggested implementation:
```python
if self.dequeue_context_length <= 0:
self.dequeue_context_length = 1
# Share the same default as MainAgentBuildConfig to keep UI/config/behavior in sync
from astrbot.core.config.main_agent_build import MainAgentBuildConfig
self.fallback_max_context_tokens: int = settings.get(
"fallback_max_context_tokens",
MainAgentBuildConfig().fallback_max_context_tokens,
)
self.llm_safety_mode = settings.get("llm_safety_mode", True)
```
- If `MainAgentBuildConfig` requires constructor arguments (e.g., settings or other config), replace `MainAgentBuildConfig()` with the appropriate initialization used elsewhere in the codebase (for example, `MainAgentBuildConfig.from_settings(settings)` or similar) and keep using its `.fallback_max_context_tokens` default as the fallback.
- If there is already a shared constant or class-level attribute for this default (e.g., `MainAgentBuildConfig.DEFAULT_FALLBACK_MAX_CONTEXT_TOKENS`), prefer using that instead of instantiating the class:
```python
from astrbot.core.config.main_agent_build import MainAgentBuildConfig
self.fallback_max_context_tokens: int = settings.get(
"fallback_max_context_tokens",
MainAgentBuildConfig.DEFAULT_FALLBACK_MAX_CONTEXT_TOKENS,
)
```
- Ensure any other locations that still hard-code `128000` (e.g., `ChatProviderTemplate`) are updated to reference the same shared default to fully de-duplicate the value.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
```python
else:
    # fallback: default to configured fallback value
    provider.provider_config["max_context_tokens"] = config.fallback_max_context_tokens
```
Code Review
This pull request implements a fallback mechanism for the LLM context window size when model metadata is unavailable, updating the core logic, configuration schemas, and localization files. Feedback indicates that changing the default max_context_tokens to 128000 breaks the automatic limit detection logic, which relies on the value being 0. Additionally, several cua_ related configuration entries were accidentally deleted from the localization files and need to be restored to maintain dashboard functionality.
astrbot/core/config/default.py (312)
Changing the default value of `max_context_tokens` from 0 to 128000 breaks the automatic context-limit detection for models.
At line 1366 of astrbot/core/astr_main_agent.py, the system checks `max_context_tokens <= 0` to decide whether to fetch the model's context limit from `LLM_METADATAS`. With a default of 128000, newly added providers will no longer trigger auto-detection; even models with known limits (such as the GPT series) would be forced to 128k, which may push the context beyond what the model can actually handle.
Recommend keeping the default at 0 and letting the new `fallback_max_context_tokens` handle the fallback case.
"max_context_tokens": 0,
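The point about the `<= 0` gate can be demonstrated in a few lines. This is a minimal sketch; the `LLM_METADATAS` entry below is made up for illustration:

```python
LLM_METADATAS = {"small-model": {"limit": {"context": 8192}}}  # illustrative entry

def effective_limit(configured, model):
    # mirrors the check described at astr_main_agent.py line 1366:
    # auto-detection only runs when the configured value is <= 0
    if configured <= 0 and model in LLM_METADATAS:
        return LLM_METADATAS[model]["limit"]["context"]
    return configured

# default 0: auto-detection fires and respects the model's real limit
assert effective_limit(0, "small-model") == 8192
# default 128000: the gate never opens, so an 8k model is forced to 128k
assert effective_limit(128000, "small-model") == 128000
```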
dashboard/src/i18n/locales/en-US/features/config-metadata.json (189-212)
The CUA-related configuration metadata (cua_image, cua_os_type, cua_ttl, etc.) was accidentally deleted in this change. These entries are still referenced in astrbot/core/config/default.py; removing them will break the corresponding configuration UI in the dashboard. Please restore them.
dashboard/src/i18n/locales/ru-RU/features/config-metadata.json (189-212)
The CUA-related configuration metadata (cua_image, cua_os_type, cua_ttl, etc.) was accidentally deleted in this change. These entries are still referenced in astrbot/core/config/default.py; removing them will break the corresponding configuration UI in the dashboard. Please restore them.
dashboard/src/i18n/locales/zh-CN/features/config-metadata.json (191-214)
The CUA-related configuration metadata (cua_image, cua_os_type, cua_ttl, etc.) was accidentally deleted in this change. These entries are still referenced in astrbot/core/config/default.py; removing them will break the corresponding configuration UI in the dashboard or leave it missing translations. Please restore them.
Force-pushed from 1377e86 to 07a69ac
- New config item `fallback_max_context_tokens` (default 128k)
- When `max_context_tokens` is 0 and model not in `LLM_METADATAS`, use `fallback_max_context_tokens` as the context window limit
- Unified global config under `provider_settings`, in `truncate_and_compress` section
- i18n: zh-CN, en-US, ru-RU
Force-pushed from 07a69ac to be3a7ad
Changes
Adds a global config item `fallback_max_context_tokens` as a fallback strategy for context compression.
Background
When `max_context_tokens` is 0 and the model is not in `LLM_METADATAS`, there was previously no fallback, so context compression would never trigger. `fallback_max_context_tokens` (default 128000) is now used as the fallback value.
Change list
- Added `provider_settings.fallback_max_context_tokens`, type `int`, default `128000`
Behavior changes
- `max_context_tokens = 0`, model in metadata → use the model's metadata context limit
- `max_context_tokens = 0`, model not in metadata → use `fallback_max_context_tokens` (128k)
Summary by Sourcery
Introduce a configurable fallback context window size for local chat providers and use it when model metadata is unavailable.