fix(responses): merge leading system messages in user_note_safe fallback#1923

Merged
jundot merged 1 commit into
jundot:mainfrom
imi4u36d:fix/responses-merge-leading-system-messages
Jun 22, 2026

@imi4u36d

Contributor

Problem

When Codex App Desktop sends /v1/responses requests with both instructions and a system/developer message in input, two leading system messages are produced. If the conversation also contains a mid-system message (e.g. via previous_response_id chain or multi-turn input), the user_note_safe downgrade path in prepare_system_messages_for_template preserves the leading system block as-is without merging.

Since _merge_consecutive_roles only merges user/assistant roles, the two consecutive system messages survive into the chat template. Strict templates like Qwen3.6 validate that system messages can only appear at the first position, causing:

POST /v1/responses -> 400: Chat template error: System message must be at the beginning.

Root Cause

In omlx/api/utils.py, _downgrade_mid_system_to_user_notes preserves leading system blocks via rewritten.extend(messages[start:i]) at lines 601-602. The caller unsupported_fallback then applies only _merge_consecutive_roles, which merges user/assistant messages and leaves system messages untouched.
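The interaction described above can be illustrated with a small standalone sketch. It does not reproduce the omlx code: the two helper functions, the "[system note]" prefix, and the "\n\n" separator are hypothetical stand-ins, assumed only to show how a preserved leading block plus a user/assistant-only merge leaves two system messages in place.

```python
from typing import Dict, List

Message = Dict[str, str]


def downgrade_mid_system_sketch(messages: List[Message]) -> List[Message]:
    """Hypothetical stand-in: keep the leading system block, turn later system messages into user notes."""
    i = 0
    while i < len(messages) and messages[i]["role"] == "system":
        i += 1
    # The leading system block is copied through unchanged (not merged).
    rewritten = [dict(m) for m in messages[:i]]
    for m in messages[i:]:
        if m["role"] == "system":
            # The note prefix is an assumption made for illustration only.
            rewritten.append({"role": "user", "content": "[system note] " + m["content"]})
        else:
            rewritten.append(dict(m))
    return rewritten


def merge_user_assistant_sketch(messages: List[Message]) -> List[Message]:
    """Hypothetical stand-in: merge consecutive user or assistant messages only."""
    merged: List[Message] = []
    for m in messages:
        same_role = bool(merged) and merged[-1]["role"] == m["role"]
        if same_role and m["role"] in ("user", "assistant"):
            merged[-1]["content"] += "\n\n" + m["content"]
        else:
            merged.append(dict(m))
    return merged


example = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "system", "content": "Be concise"},
    {"role": "user", "content": "Hello"},
    {"role": "system", "content": "Remember this"},
    {"role": "user", "content": "Continue"},
]

roles_without_system_merge = [
    m["role"] for m in merge_user_assistant_sketch(downgrade_mid_system_sketch(example))
]
print(roles_without_system_merge)
```

Under these assumptions the two leading system messages are still separate after both steps, which is the shape a strict template rejects.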

Fix

Call _merge_consecutive_system_messages in unsupported_fallback before _merge_consecutive_roles when the user_note_safe path succeeds. This merges consecutive leading system messages into one, satisfying strict template validation.
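As a rough illustration of what merging consecutive system messages means here, the sketch below collapses adjacent system messages into one. It is not the body of _merge_consecutive_system_messages: the function name, the string-only content, and the "\n\n" separator are assumptions chosen for the example.

```python
from typing import Dict, List

Message = Dict[str, str]


def merge_consecutive_system_sketch(messages: List[Message], sep: str = "\n\n") -> List[Message]:
    """Hypothetical sketch: collapse runs of adjacent system messages into a single message."""
    merged: List[Message] = []
    for m in messages:
        if merged and merged[-1]["role"] == "system" and m["role"] == "system":
            # Append to the previous system message instead of emitting a second one.
            merged[-1]["content"] += sep + m["content"]
        else:
            merged.append(dict(m))  # copy so the caller's input is not mutated
    return merged


after_downgrade = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "system", "content": "Be concise"},
    {"role": "user", "content": "Hello"},
]

merged_messages = merge_consecutive_system_sketch(after_downgrade)
merged_roles = [m["role"] for m in merged_messages]
print(merged_roles)
```

With a single system message at index 0, a template that rejects any system message at a later position no longer has anything to reject.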

Reproduction

from omlx.api.utils import prepare_system_messages_for_template

class Qwen36Tokenizer:
    def apply_chat_template(self, messages, **kw):
        for i, msg in enumerate(messages):
            if msg.get('role') == 'system' and i > 0:
                raise Exception('System message must be at the beginning.')
        return 'ok'

messages = [
    {"role": "system", "content": "You are a helpful assistant"},  # from instructions
    {"role": "system", "content": "Be concise"},                   # from input
    {"role": "user", "content": "Hello"},
    {"role": "system", "content": "Remember this"},                # mid-system
    {"role": "user", "content": "Continue"},
]

result = prepare_system_messages_for_template(
    messages, Qwen36Tokenizer(), unsupported_mid_system_policy="user_note_safe")
# Before fix: ['system', 'system', 'user'] → template error
# After fix:  ['system', 'user']           → OK

Closes #1908

When `_downgrade_mid_system_to_user_notes` succeeds, it preserves
leading system blocks as-is (`rewritten.extend(messages[start:i])`).
If there are multiple leading system messages (e.g. `instructions` +
input `system`/developer message), they survive as separate messages.
The subsequent `_merge_consecutive_roles` only merges user/assistant
roles, leaving consecutive system messages unmerged.

Strict chat templates like Qwen3.6 validate that system messages only
appear at the first position, so two leading system messages trigger:
  "System message must be at the beginning."

Fix: call `_merge_consecutive_system_messages` in `unsupported_fallback`
before `_merge_consecutive_roles` when the user_note_safe path succeeds.

Closes jundot#1908
@jundot

jundot commented Jun 22, 2026

Owner

Thanks for the fix. The change is scoped to the user_note_safe fallback and uses the existing system-message merge path after mid-system notes are downgraded, so it addresses #1908 without changing the native-preserve path. I verified the API utility tests locally and CI is green. This looks good to me, and I'm going to merge it.

@jundot jundot merged commit 8663a26 into jundot:main Jun 22, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

Bug: /v1/responses adapter prepends instructions without deduplicating system messages, triggering strict chat template validation

2 participants