Skip to content

fix: harden OpenAI-compatible Phase 1 routing#998

Closed
MaH-coding wants to merge 2 commits into
TauricResearch:mainfrom
MaH-coding:fix/openrouter-phase1-hardening
Closed

fix: harden OpenAI-compatible Phase 1 routing#998
MaH-coding wants to merge 2 commits into
TauricResearch:mainfrom
MaH-coding:fix/openrouter-phase1-hardening

Conversation

@MaH-coding

Copy link
Copy Markdown

Summary

  • Route native OpenAI hosts to the Responses API while keeping OpenAI-compatible custom endpoints on Chat Completions.
  • Recognize NVIDIA-hosted MiniMax model IDs such as minimaxai/minimax-m2.7 as MiniMax thinking-capable.
  • Harden structured-output fallback when a structured invocation returns None.
  • Add regression coverage for endpoint routing, MiniMax capability matching, and structured fallback.

Verification

  • Targeted Docker verification passed inside tradingagents:phase1 with network disabled and read-only source mount: targeted_checks=PASS.
  • Pre-push privacy scan passed: no hardcoded credential assignments, no OpenRouter/NVIDIA/OpenAI key-shaped secrets, no personal paths/emails, no local artifacts staged.
  • Independent code review passed with no blocking security concerns or logic errors.

Notes

  • This PR does not include local runtime smoke logs, cache files, .env, or Docker override artifacts.
  • Phase 1 OpenRouter smoke results were treated as capability evaluation evidence only, not investment conclusions.

- Route OpenAI-compatible custom endpoints through Chat Completions instead of Responses API
- Treat NVIDIA-hosted minimax model names as MiniMax thinking-capable
- Fall back cleanly when structured output returns None
- Add regression coverage for routing, capabilities, and structured fallback

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several improvements to LLM client routing and capabilities. It ensures that the Responses API is only enabled for native OpenAI endpoints by parsing the base URL, handles None results from structured LLM outputs by raising a ValueError and falling back to free text, and updates capability matching to use pattern.search with support for prefixed MiniMax models. The review feedback suggests generalizing the DeepSeek capability patterns to support prefixed and case-insensitive names, making the base_url parsing more robust when a scheme is missing, and expanding the test suite to cover these hosted model capabilities.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +114 to +117
(re.compile(r"^deepseek-v\d"), _DEEPSEEK_THINKING),
(re.compile(r"^deepseek-reasoner"), _DEEPSEEK_THINKING),
(re.compile(r"^MiniMax-M\d"), _MINIMAX_THINKING),
(re.compile(r"(^|/)minimax-m\d", re.IGNORECASE), _MINIMAX_THINKING),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current patterns for DeepSeek models (^deepseek-v\d and ^deepseek-reasoner) use the ^ anchor, which prevents them from matching prefixed model names (e.g., deepseek-ai/deepseek-v3 or nvidia/deepseek-r1) when hosted on third-party providers like OpenRouter or NVIDIA. Since pattern.search is now used, we can generalize these patterns to support optional prefixes using (^|/) and make them case-insensitive, similar to the MiniMax pattern. This also allows us to remove the redundant MiniMax-M\d pattern.

Suggested change
(re.compile(r"^deepseek-v\d"), _DEEPSEEK_THINKING),
(re.compile(r"^deepseek-reasoner"), _DEEPSEEK_THINKING),
(re.compile(r"^MiniMax-M\d"), _MINIMAX_THINKING),
(re.compile(r"(^|/)minimax-m\d", re.IGNORECASE), _MINIMAX_THINKING),
(re.compile(r"(^|/)deepseek-v\d", re.IGNORECASE), _DEEPSEEK_THINKING),
(re.compile(r"(^|/)deepseek-reasoner", re.IGNORECASE), _DEEPSEEK_THINKING),
(re.compile(r"(^|/)minimax-m\d", re.IGNORECASE), _MINIMAX_THINKING),

Comment on lines +195 to +198
if not base_url:
return True
host = urlparse(base_url).hostname or ""
return host == "api.openai.com" or host.endswith(".openai.com")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If base_url is configured without a scheme (e.g., api.openai.com/v1 or my-proxy.openai.com), urlparse treats the host as part of the path, resulting in an empty hostname. This would cause _is_native_openai_base_url to return False and incorrectly disable the Responses API for native OpenAI. Prepending a default scheme when :// is missing makes the parsing robust.

Suggested change
if not base_url:
return True
host = urlparse(base_url).hostname or ""
return host == "api.openai.com" or host.endswith(".openai.com")
if not base_url:
return True
if "://" not in base_url:
base_url = "https://" + base_url
host = urlparse(base_url).hostname or ""
return host == "api.openai.com" or host.endswith(".openai.com")

Comment thread tests/test_minimax.py
Comment on lines +107 to +110
@pytest.mark.unit
class TestNvidiaHostedMinimaxCapabilities:
def test_nvidia_hosted_minimax_m2_7_uses_minimax_thinking_capabilities(self):
assert get_capabilities("minimaxai/minimax-m2.7") == _MINIMAX_THINKING

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Rename the test class to a more general name and add test cases for hosted DeepSeek models to ensure the generalized capability pattern matching works correctly and prevents future regressions.

Suggested change
@pytest.mark.unit
class TestNvidiaHostedMinimaxCapabilities:
def test_nvidia_hosted_minimax_m2_7_uses_minimax_thinking_capabilities(self):
assert get_capabilities("minimaxai/minimax-m2.7") == _MINIMAX_THINKING
@pytest.mark.unit
class TestHostedModelCapabilities:
def test_nvidia_hosted_minimax_m2_7_uses_minimax_thinking_capabilities(self):
assert get_capabilities("minimaxai/minimax-m2.7") == _MINIMAX_THINKING
def test_hosted_deepseek_uses_deepseek_thinking_capabilities(self):
from tradingagents.llm_clients.capabilities import _DEEPSEEK_THINKING
assert get_capabilities("deepseek-ai/deepseek-v3") == _DEEPSEEK_THINKING
assert get_capabilities("nvidia/deepseek-r1") == _DEEPSEEK_THINKING

@Yijia-Xiao

Copy link
Copy Markdown
Member

Thanks @MaH-coding. This distinction — native OpenAI hosts on the Responses API, OpenAI-compatible endpoints on Chat Completions — is now handled by the provider registry that drives every OpenAI-compatible provider. Closing as superseded; appreciate flagging it.

@Yijia-Xiao Yijia-Xiao closed this Jun 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants