feat: 新增 console.x.ai 路由,让 basic 账号使用 grok-4.3/4.20 系列(默认开启 web search)#542
Open
cloudriver8 wants to merge 4 commits into
Open
feat: 新增 console.x.ai 路由,让 basic 账号使用 grok-4.3/4.20 系列(默认开启 web search)#542cloudriver8 wants to merge 4 commits into
cloudriver8 wants to merge 4 commits into
Conversation
… fallback Multi-agent console models (grok-4.20-multi-agent) skip web_search_call output items entirely and publish citation URLs only as document-level annotations on the assistant message with start_index == end_index == 0. The previous extractor relied solely on web_search_call items, leaving search_sources empty for these responses despite usage reporting real reasoning_tokens and visible citations. - extract_console_search_sources() now falls back to message annotations after exhausting web_search_call sources, deduping against URLs already collected so single-agent responses are unchanged. - ConsoleStreamAdapter mirrors the same fallback inside the SSE handler for response.output_text.annotation.added events. - Strip title when it duplicates the URL (common in multi-agent output). Verified end-to-end against grok-4.20-multi-agent on AWS deployment: sources count goes from 0 to a non-empty list while annotations and inline [[N]](url) citations remain intact.
The /webui/api/models endpoint listed every enabled ModelSpec regardless of which account pool the deployment actually has. Users with only basic accounts saw super/heavy-tier model names in the dropdown that immediately failed with 'No available accounts for this model tier' when selected. Reuse the same _available_pools + _model_available_for_pools filter that /v1/models already applies, so both endpoints stay in sync. Dropdown now shows only the subset of enabled models the configured account pool can actually serve. Verified on AWS deployment with 88 basic accounts: WebUI dropdown shrinks from ~30 entries to the 9 basic-tier-eligible models (console + grok.com basic + image), eliminating the unreachable-tier confusion.
timi778
added a commit
to joyce677/grok2api
that referenced
this pull request
May 18, 2026
Import PR chenyme#542 from chenyme/grok2api with console.x.ai Responses routing for grok-4.3 and grok-4.20 model variants, default web search support, account feedback handling for 402 responses, and WebUI model availability filtering.
timi778
added a commit
to timi778/grok2api
that referenced
this pull request
May 18, 2026
highkay
added a commit
to highkay/grok2api
that referenced
this pull request
May 19, 2026
|
佬反馈个bug,你这个版本的新模型不受 |
- Add default_reasoning_effort field to ModelSpec - Set high for grok-4/grok-4.3/grok-4.20, leave others unset - Auto-fill default effort in chat.py and responses.py dispatch - Forward reasoning_effort param through router.py - Verified: reasoning_tokens 190->273 (+44%) without explicit effort
Author
Not a code bug. [build_console_payload] grok2api/app/dataplane/reverse/protocol/xai_console.py:267:0-333:18) correctly reads from |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
将 6 个 console 系列模型(
grok-4.3/grok-4/grok-4.20/grok-4.20-reasoning/grok-4.20-non-reasoning/grok-4.20-multi-agent)通过console.x.ai/v1/responses路由,使免费 basic 账号即可调用——免账户充值。主要功能 (feat)
新协议层
app/dataplane/reverse/protocol/xai_console.pyinput数组:多模态 (text + image_url/base64)、对话历史、function_call/function_call_outputtools/tool_choice,不再走 ToolSieve XML 注入)instructions字段聚合role=system消息,改善推理模型表现ConsoleStreamAdapter,处理 14+ 种上游事件类型模型注册
app/control/model/registry.py+app/control/model/spec.pyconsole_model字段标识 console 路由模型路由 / 端点
app/dataplane/reverse/runtime/endpoint_table.py新增CONSOLE_RESPONSESapp/dataplane/reverse/planner.py将spec.is_console()模型路由到 consoleapp/products/openai/chat.py新增_console_completions(流式 + 非流式),自动注入web_search工具,提取并注入search_sources到响应根字段app/products/openai/responses.py新增_console_responses_dispatch,直接透传上游 Responses 格式 + SSE 事件app/products/anthropic/messages.py通过 Chat Completions bridge 复用,让/v1/messages也支持新模型额度耗尽自动绕过
app/control/account/invalid_credentials.py+state_machine.pyFeedbackKind.RATE_LIMITED修复 (fix)
multi-agent 信源提取:
grok-4.20-multi-agent上游不发web_search_callitems,仅以 message annotations 形式发布 URL(start_index == end_index == 0)。原本只从web_search_call提取,导致 multi-agent 的search_sources始终为空。新增 fallback 从 annotations 提取 URL,dedupe 后注入。WebUI 模型下拉列表过滤:
/webui/api/models端点之前列出所有enabled=True的模型,包括账号池根本没有的 super/heavy tier,用户选了就报No available accounts for this model tier。改为复用/v1/models同样的_available_pools+_model_available_for_pools过滤逻辑,两个端点输出保持一致。不破坏现有行为
ModelSpec新增字段默认值None,老模型不受影响grok-4.20-multi-agent-0309等保留在 registry 里(HEAVY tier),有 heavy 账号的用户仍可用/v1/models同源,行为对齐背景说明
console.x.ai/v1/responses入口用同一套 SSO cookie 可访问。reasoning.summary='detailed'+include=['reasoning.encrypted_content']),客户端只能看到reasoning_tokens计数。这是上游行为,类似 OpenAI o1。Testing
自动化检查
python -m py_compile全文件语法验证通过python -m pyflakes仅原有未使用 import 警告,新增代码无问题实测部署(AWS EC2 / Debian 12 / 88 个 basic 账号)
调用
/v1/chat/completions非流式:返回(截断):
{ "model": "grok-4.3", "choices": [{"message": { "content": "The current CEO of OpenAI is Sam Altman.[[1]](https://www.clay.com/dossier/openai-ceo)[[2]](https://en.wikipedia.org/wiki/Sam_Altman)", "annotations": [{"type": "url_citation", "url_citation": {"url": "https://www.clay.com/dossier/openai-ceo", "title": "1", "start_index": 40, "end_index": 86}}] }}], "usage": {"prompt_tokens": 3806, "completion_tokens": 1022, "reasoning_tokens": 493, "total_tokens": 4828}, "search_sources": [{"url": "https://en.wikipedia.org/wiki/Sam_Altman"}] }多模型 / 多端点真实场景验证
grok-4.3/v1/chat/completionsgrok-4.3/v1/chat/completionsgrok-4.3/v1/responsesgrok-4.3/v1/messagesgrok-4.20-non-reasoninggrok-4.20-reasoninggrok-4.20-multi-agentWebUI 过滤验证
修复前:
/webui/api/models返回 30+ 个模型,包括 super/heavy tier。修复后:与
/v1/models输出对齐,仅返回 basic 账号能用的 9 个:Related