feat: add console.x.ai routing so basic accounts can use the grok-4.3/4.20 series (web search enabled by default) by cloudriver8 · Pull Request #542 · chenyme/grok2api

cloudriver8 · 2026-05-17T15:10:00Z

Summary

Routes six console-series models (grok-4.3 / grok-4 / grok-4.20 / grok-4.20-reasoning / grok-4.20-non-reasoning / grok-4.20-multi-agent) through console.x.ai/v1/responses, so free basic accounts can call them without topping up the account.

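Main features (feat)

New protocol layer app/dataplane/reverse/protocol/xai_console.py
- Supports the full Responses API request / response format
- Structured input array: multimodal (text + image_url/base64), conversation history, function_call / function_call_output
- Native function calling (OpenAI-compatible tools / tool_choice are passed through; ToolSieve XML injection is no longer used)
- The instructions field aggregates role=system messages, which improves reasoning-model behavior
- Full SSE streaming adapter ConsoleStreamAdapter, handling 14+ upstream event types
Model registration app/control/model/registry.py + app/control/model/spec.py
- New console_model field marks models routed through console
- Public names of the 6 new models map to the real upstream model IDs
Routing / endpoints
- app/dataplane/reverse/runtime/endpoint_table.py adds CONSOLE_RESPONSES
- app/dataplane/reverse/planner.py routes models where spec.is_console() is true to console
- app/products/openai/chat.py adds _console_completions (streaming + non-streaming), automatically injects the web_search tool, and extracts search_sources and injects it into the response root
- app/products/openai/responses.py adds _console_responses_dispatch, which passes the upstream Responses format + SSE events straight through
- app/products/anthropic/messages.py reuses the Chat Completions bridge, so /v1/messages also supports the new models
Automatic bypass of exhausted quota app/control/account/invalid_credentials.py + state_machine.py
- HTTP 402 (trial credits used up) is mapped to FeedbackKind.RATE_LIMITED
- The account pool skips tokens whose quota is exhausted and switches to another available account automatically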
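
The quota-exhaustion bypass described in the last item can be sketched roughly as follows. This is an illustration under assumptions, not the code in invalid_credentials.py or state_machine.py: only FeedbackKind.RATE_LIMITED and the 402 mapping come from the PR text, while `Account`, `cooling_down`, `classify_status`, `pick_account`, `record_feedback`, and the other enum members are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class FeedbackKind(Enum):
    # RATE_LIMITED is named in the PR; OK and OTHER_ERROR are placeholders.
    OK = auto()
    RATE_LIMITED = auto()
    OTHER_ERROR = auto()


def classify_status(status: int) -> FeedbackKind:
    """Map an upstream HTTP status to a feedback kind."""
    if status == 402:  # trial credits exhausted, handled like a rate limit
        return FeedbackKind.RATE_LIMITED
    if 200 <= status < 300:
        return FeedbackKind.OK
    return FeedbackKind.OTHER_ERROR


@dataclass
class Account:
    token: str
    cooling_down: bool = False  # hypothetical flag for "skip this token"


def pick_account(pool: list[Account]) -> Optional[Account]:
    """Return the first account that is not cooling down, or None."""
    return next((a for a in pool if not a.cooling_down), None)


def record_feedback(account: Account, status: int) -> None:
    """Take the account out of rotation when the upstream reports exhaustion."""
    if classify_status(status) is FeedbackKind.RATE_LIMITED:
        account.cooling_down = True
```

With this shape, a 402 on one token only removes that token from rotation; the next call to `pick_account` returns another account if the pool still has one.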

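Fixes (fix)

Multi-agent source extraction: upstream grok-4.20-multi-agent does not emit web_search_call items and publishes URLs only as message annotations (start_index == end_index == 0). Sources were previously extracted only from web_search_call, so search_sources was always empty for multi-agent. A new fallback extracts URLs from annotations, dedupes them, and injects them.
WebUI model dropdown filtering: the /webui/api/models endpoint used to list every model with enabled=True, including super/heavy tier models the account pool does not have at all, so selecting one failed with No available accounts for this model tier. It now reuses the same _available_pools + _model_available_for_pools filter as /v1/models, keeping the output of both endpoints consistent.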

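No breaking changes to existing behavior

The grok.com path (grok-4.20-fast / 0309-non-reasoning / imagine, etc.) is untouched
The new ModelSpec field defaults to None, so existing models are unaffected
Older models such as grok-4.20-multi-agent-0309 stay in the registry (HEAVY tier) and remain usable for users with heavy accounts
The WebUI filter shares its logic with /v1/models, so behavior is aligned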
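
A minimal sketch of why a None default keeps existing models on their current path. The field name console_model, the is_console() method, and CONSOLE_RESPONSES appear in the PR; the class name `ModelSpecSketch`, the `plan_route` helper, the `GROK_COM` label, the assumption that the field holds an upstream model ID string, and the example ID value are all illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpecSketch:
    public_name: str
    # Assumed to hold the upstream console model ID; None keeps the model
    # on the existing grok.com path.
    console_model: Optional[str] = None

    def is_console(self) -> bool:
        return self.console_model is not None


def plan_route(spec: ModelSpecSketch) -> str:
    """Pick an endpoint label; 'GROK_COM' is a made-up name for the old path."""
    return "CONSOLE_RESPONSES" if spec.is_console() else "GROK_COM"


# Existing registry entries never set the new field, so they are unaffected.
old_model = ModelSpecSketch("grok-4.20-fast")
new_model = ModelSpecSketch("grok-4.3", console_model="grok-4.3")  # placeholder ID
```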

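Background

Motivation: basic accounts cannot use advanced models such as grok-4.3 / 4.20 reasoning, but xAI exposes a console.x.ai/v1/responses entry point that is reachable with the same SSO cookie.
Known limitation: the xAI server does not return the reasoning summary in plain text (even with reasoning.summary='detailed' + include=['reasoning.encrypted_content']); clients only see the reasoning_tokens count. This is upstream behavior, similar to OpenAI o1.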

Testing

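Automated checks

python -m py_compile: syntax check passes for all files
python -m pyflakes: only pre-existing unused-import warnings; no issues in the new code

Real-world deployment (AWS EC2 / Debian 12 / 88 basic accounts)

Non-streaming call to /v1/chat/completions: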

curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"grok-4.3","messages":[{"role":"user","content":"Who is the current CEO of OpenAI? One sentence."}]}'

Response (truncated):

{
  "model": "grok-4.3",
  "choices": [{"message": {
    "content": "The current CEO of OpenAI is Sam Altman.[[1]](https://www.clay.com/dossier/openai-ceo)[[2]](https://en.wikipedia.org/wiki/Sam_Altman)",
    "annotations": [{"type": "url_citation", "url_citation": {"url": "https://www.clay.com/dossier/openai-ceo", "title": "1", "start_index": 40, "end_index": 86}}]
  }}],
  "usage": {"prompt_tokens": 3806, "completion_tokens": 1022, "reasoning_tokens": 493, "total_tokens": 4828},
  "search_sources": [{"url": "https://en.wikipedia.org/wiki/Sam_Altman"}]
}
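
The request above sends no tools, yet the response carries citations and search_sources, which is consistent with the automatic web_search injection and the instructions aggregation listed under the main features. A rough sketch of that payload construction follows. It is not the PR's build_console_payload: the function name `build_console_payload_sketch`, the `{"type": "web_search"}` tool shape, and the choice to join system messages with blank lines are assumptions.

```python
from typing import Any, Optional


def build_console_payload_sketch(
    model: str,
    messages: list[dict[str, Any]],
    tools: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Turn Chat Completions style messages into a Responses style payload."""
    # role=system messages are collected into a single instructions string.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    input_items = [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] != "system"
    ]

    merged_tools = list(tools or [])
    # Inject web_search only when the caller has not already supplied it.
    if not any(t.get("type") == "web_search" for t in merged_tools):
        merged_tools.append({"type": "web_search"})

    payload: dict[str, Any] = {
        "model": model,
        "input": input_items,
        "tools": merged_tools,
    }
    if system_parts:
        payload["instructions"] = "\n\n".join(system_parts)
    return payload
```

Caller-supplied tools are kept in place and passed through untouched, matching the native function calling behavior the PR describes.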

Multi-model / multi-endpoint real-world verification

| Model | Endpoint | Streaming | Result |
| --- | --- | --- | --- |
| `grok-4.3` | `/v1/chat/completions` | No | ✅ 10 sources, reasoning=493 |
| `grok-4.3` | `/v1/chat/completions` | Yes | ✅ SSE works |
| `grok-4.3` | `/v1/responses` | No | ✅ output array passed through |
| `grok-4.3` | `/v1/messages` | Yes | ✅ Anthropic SSE event stream complete |
| `grok-4.20-non-reasoning` | chat | No | ✅ 24 sources, reasoning=0 |
| `grok-4.20-reasoning` | chat | No | ✅ 5 sources, reasoning=417 |
| `grok-4.20-multi-agent` | chat | No | ✅ 5 sources (fallback in effect), reasoning=1609 |

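WebUI filter verification

Before the fix: /webui/api/models returned 30+ models, including super/heavy tier ones.
After the fix: the output matches /v1/models and contains only the 9 models that basic accounts can use: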

grok-4, grok-4.3, grok-4.20, grok-4.20-fast, grok-4.20-multi-agent,
grok-4.20-non-reasoning, grok-4.20-reasoning, grok-4.20-0309-non-reasoning,
grok-imagine-image-lite

… fallback

Multi-agent console models (grok-4.20-multi-agent) skip web_search_call output items entirely and publish citation URLs only as document-level annotations on the assistant message with start_index == end_index == 0. The previous extractor relied solely on web_search_call items, leaving search_sources empty for these responses despite usage reporting real reasoning_tokens and visible citations.

- extract_console_search_sources() now falls back to message annotations after exhausting web_search_call sources, deduping against URLs already collected so single-agent responses are unchanged.
- ConsoleStreamAdapter mirrors the same fallback inside the SSE handler for response.output_text.annotation.added events.
- Strip title when it duplicates the URL (common in multi-agent output).

Verified end-to-end against grok-4.20-multi-agent on AWS deployment: sources count goes from 0 to a non-empty list while annotations and inline [[N]](url) citations remain intact.
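
The annotation fallback in this commit might look roughly like the sketch below. It is not the PR's extract_console_search_sources(): the helper name `merge_annotation_sources` is hypothetical, the sources already taken from web_search_call items are passed in as `primary` rather than parsed here, and the annotation layout (a flat url_citation entry with url and title inside each content part of a message item) is an assumption modeled on the OpenAI Responses format.

```python
from typing import Any


def merge_annotation_sources(
    primary: list[dict[str, str]],
    output_items: list[dict[str, Any]],
) -> list[dict[str, str]]:
    """Append annotation URLs that are not already among the primary sources."""
    sources = list(primary)
    seen = {s["url"] for s in sources}
    for item in output_items:
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            for ann in part.get("annotations", []):
                url = ann.get("url")
                if ann.get("type") != "url_citation" or not url or url in seen:
                    continue
                seen.add(url)
                source = {"url": url}
                title = ann.get("title")
                # Drop the title when it merely repeats the URL.
                if title and title != url:
                    source["title"] = title
                sources.append(source)
    return sources
```

Because every annotation URL is checked against the URLs already collected, a response whose annotations only repeat its web_search_call sources comes back unchanged.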

The /webui/api/models endpoint listed every enabled ModelSpec regardless of which account pool the deployment actually has. Users with only basic accounts saw super/heavy-tier model names in the dropdown that immediately failed with 'No available accounts for this model tier' when selected.

Reuse the same _available_pools + _model_available_for_pools filter that /v1/models already applies, so both endpoints stay in sync. Dropdown now shows only the subset of enabled models the configured account pool can actually serve.

Verified on AWS deployment with 88 basic accounts: WebUI dropdown shrinks from ~30 entries to the 9 basic-tier-eligible models (console + grok.com basic + image), eliminating the unreachable-tier confusion.
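
The idea of sharing one pool-based filter between both endpoints can be sketched as below. The real _available_pools and _model_available_for_pools signatures are not shown in the PR, so `Spec`, `available_pools`, `model_available_for_pools`, and `list_models` are stand-ins, and the rule that a model is served only by a pool of exactly its own tier is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    name: str
    tier: str  # e.g. "basic", "super", "heavy"
    enabled: bool = True


def available_pools(accounts: list[dict[str, str]]) -> set[str]:
    """Collect the tiers for which the deployment has at least one account."""
    return {a["tier"] for a in accounts}


def model_available_for_pools(spec: Spec, pools: set[str]) -> bool:
    """A model is listed only if it is enabled and some pool can serve it."""
    return spec.enabled and spec.tier in pools


def list_models(specs: list[Spec], accounts: list[dict[str, str]]) -> list[str]:
    """Single listing helper that both endpoints would call."""
    pools = available_pools(accounts)
    return [s.name for s in specs if model_available_for_pools(s, pools)]
```

Routing both /v1/models and /webui/api/models through one helper like `list_models` is what keeps the two outputs from drifting apart.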

Import PR chenyme#542 from chenyme/grok2api with console.x.ai Responses routing for grok-4.3 and grok-4.20 model variants, default web search support, account feedback handling for 402 responses, and WebUI model availability filtering.

imjcal · 2026-05-19T12:38:31Z

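Reporting a bug: in this version the new models are not affected by the "Global additional instruction" setting ("Injects a uniform system message into every request, to constrain model behavior or fix a role"). It has no effect on them.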

- Add default_reasoning_effort field to ModelSpec
- Set high for grok-4/grok-4.3/grok-4.20, leave others unset
- Auto-fill default effort in chat.py and responses.py dispatch
- Forward reasoning_effort param through router.py
- Verified: reasoning_tokens 190->273 (+44%) without explicit effort
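
The auto-fill rule in this commit amounts to "an explicit value wins, otherwise use the model default". A minimal sketch, assuming a plain lookup table in place of the ModelSpec field; `DEFAULT_REASONING_EFFORT` and `resolve_reasoning_effort` are hypothetical names.

```python
from typing import Optional

# Per the commit: "high" for these three models, unset for all others.
# The PR stores this on ModelSpec; a dict is used here only for brevity.
DEFAULT_REASONING_EFFORT: dict[str, str] = {
    "grok-4": "high",
    "grok-4.3": "high",
    "grok-4.20": "high",
}


def resolve_reasoning_effort(model: str, requested: Optional[str]) -> Optional[str]:
    """Return the caller's effort if given, else the model default (or None)."""
    if requested is not None:
        return requested
    return DEFAULT_REASONING_EFFORT.get(model)
```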

cloudriver8 · 2026-05-20T06:46:01Z

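Reporting a bug: in this version the new models are not affected by the "Global additional instruction" setting ("Injects a uniform system message into every request, to constrain model behavior or fix a role"). It has no effect on them.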

Not a code bug. build_console_payload (grok2api/app/dataplane/reverse/protocol/xai_console.py, lines 267-333) correctly reads from features.custom_instruction. The issue was that the value in config.toml was mistakenly placed under the [app] section instead of [features]. Verified working after moving it to the correct section.

cloudriver8 added 3 commits May 17, 2026 21:45

feat: add console.x.ai routing for grok-4.20+ models with web search

bddbad9

timi778 added a commit to timi778/grok2api that referenced this pull request May 18, 2026

Merge PR chenyme#542 from chenyme/grok2api

c24402a

highkay added a commit to highkay/grok2api that referenced this pull request May 19, 2026

Merge pull request chenyme#542 from chenyme/web-search-console-routing

7953ed8

cloudriver8 wants to merge 4 commits into chenyme:main from cloudriver8:feat/console-x-ai-routing