fix: harden OpenAI-compatible Phase 1 routing#998
Conversation
- Route OpenAI-compatible custom endpoints through Chat Completions instead of Responses API - Treat NVIDIA-hosted minimax model names as MiniMax thinking-capable - Fall back cleanly when structured output returns None - Add regression coverage for routing, capabilities, and structured fallback
There was a problem hiding this comment.
Code Review
This pull request introduces several improvements to LLM client routing and capabilities. It ensures that the Responses API is only enabled for native OpenAI endpoints by parsing the base URL, handles None results from structured LLM outputs by raising a ValueError and falling back to free text, and updates capability matching to use pattern.search with support for prefixed MiniMax models. The review feedback suggests generalizing the DeepSeek capability patterns to support prefixed and case-insensitive names, making the base_url parsing more robust when a scheme is missing, and expanding the test suite to cover these hosted model capabilities.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| (re.compile(r"^deepseek-v\d"), _DEEPSEEK_THINKING), | ||
| (re.compile(r"^deepseek-reasoner"), _DEEPSEEK_THINKING), | ||
| (re.compile(r"^MiniMax-M\d"), _MINIMAX_THINKING), | ||
| (re.compile(r"(^|/)minimax-m\d", re.IGNORECASE), _MINIMAX_THINKING), |
There was a problem hiding this comment.
The current patterns for DeepSeek models (^deepseek-v\d and ^deepseek-reasoner) use the ^ anchor, which prevents them from matching prefixed model names (e.g., deepseek-ai/deepseek-v3 or nvidia/deepseek-r1) when hosted on third-party providers like OpenRouter or NVIDIA. Since pattern.search is now used, we can generalize these patterns to support optional prefixes using (^|/) and make them case-insensitive, similar to the MiniMax pattern. This also allows us to remove the redundant MiniMax-M\d pattern.
| (re.compile(r"^deepseek-v\d"), _DEEPSEEK_THINKING), | |
| (re.compile(r"^deepseek-reasoner"), _DEEPSEEK_THINKING), | |
| (re.compile(r"^MiniMax-M\d"), _MINIMAX_THINKING), | |
| (re.compile(r"(^|/)minimax-m\d", re.IGNORECASE), _MINIMAX_THINKING), | |
| (re.compile(r"(^|/)deepseek-v\d", re.IGNORECASE), _DEEPSEEK_THINKING), | |
| (re.compile(r"(^|/)deepseek-reasoner", re.IGNORECASE), _DEEPSEEK_THINKING), | |
| (re.compile(r"(^|/)minimax-m\d", re.IGNORECASE), _MINIMAX_THINKING), |
| if not base_url: | ||
| return True | ||
| host = urlparse(base_url).hostname or "" | ||
| return host == "api.openai.com" or host.endswith(".openai.com") |
There was a problem hiding this comment.
If base_url is configured without a scheme (e.g., api.openai.com/v1 or my-proxy.openai.com), urlparse treats the host as part of the path, resulting in an empty hostname. This would cause _is_native_openai_base_url to return False and incorrectly disable the Responses API for native OpenAI. Prepending a default scheme when :// is missing makes the parsing robust.
| if not base_url: | |
| return True | |
| host = urlparse(base_url).hostname or "" | |
| return host == "api.openai.com" or host.endswith(".openai.com") | |
| if not base_url: | |
| return True | |
| if "://" not in base_url: | |
| base_url = "https://" + base_url | |
| host = urlparse(base_url).hostname or "" | |
| return host == "api.openai.com" or host.endswith(".openai.com") |
| @pytest.mark.unit | ||
| class TestNvidiaHostedMinimaxCapabilities: | ||
| def test_nvidia_hosted_minimax_m2_7_uses_minimax_thinking_capabilities(self): | ||
| assert get_capabilities("minimaxai/minimax-m2.7") == _MINIMAX_THINKING |
There was a problem hiding this comment.
Rename the test class to a more general name and add test cases for hosted DeepSeek models to ensure the generalized capability pattern matching works correctly and prevents future regressions.
| @pytest.mark.unit | |
| class TestNvidiaHostedMinimaxCapabilities: | |
| def test_nvidia_hosted_minimax_m2_7_uses_minimax_thinking_capabilities(self): | |
| assert get_capabilities("minimaxai/minimax-m2.7") == _MINIMAX_THINKING | |
| @pytest.mark.unit | |
| class TestHostedModelCapabilities: | |
| def test_nvidia_hosted_minimax_m2_7_uses_minimax_thinking_capabilities(self): | |
| assert get_capabilities("minimaxai/minimax-m2.7") == _MINIMAX_THINKING | |
| def test_hosted_deepseek_uses_deepseek_thinking_capabilities(self): | |
| from tradingagents.llm_clients.capabilities import _DEEPSEEK_THINKING | |
| assert get_capabilities("deepseek-ai/deepseek-v3") == _DEEPSEEK_THINKING | |
| assert get_capabilities("nvidia/deepseek-r1") == _DEEPSEEK_THINKING |
|
Thanks @MaH-coding. This distinction — native OpenAI hosts on the Responses API, OpenAI-compatible endpoints on Chat Completions — is now handled by the provider registry that drives every OpenAI-compatible provider. Closing as superseded; appreciate flagging it. |
Summary
minimaxai/minimax-m2.7as MiniMax thinking-capable.None.Verification
tradingagents:phase1with network disabled and read-only source mount:targeted_checks=PASS.Notes
.env, or Docker override artifacts.