[Feature] support qwen3.5 chat template by hhaAndroid · Pull Request #1657 · InternLM/xtuner

hhaAndroid · 2026-04-07T12:19:01Z

随着 jinja 的 chat template 的复杂度逐渐增加，特别是 Agent 场合，传统的自己解析 jinja 模板并针对性设计 loss 的做法越来越无法满足需求，可维护性越来越差。

因此 XTuner 采用全新方案，不再自己解析 jinja 模板，而是优先调用 tokenizer.apply_chat_template 并针对性新增 loss 代码，从而实现一套通用的方案。

采用了新的方案后，数据格式需要改动，采用更通用的格式可以兼容 think 和非 think 模式，同时 toolcall 也可以方便支持。

…_orig_jinja

hhaAndroid · 2026-04-08T07:41:45Z

@claude review

xtuner/v1/data_proto/messages/qwen35_chat.py

tests/datasets/test_qwen35_vl_tokenize_fn.py

xtuner/v1/datasets/mllm_tokenize_fn/base_mllm_tokenize_fn.py

claude · 2026-04-08T07:47:04Z

Claude: ## Summary

Adds Qwen3.5 VL chat template support with a custom tokenization path (Qwen35ChatMessages, qwen35_tokenize_fn_fastspeed), including reasoning/thinking block handling, tool call formatting (Qwen3.5 XML style), and multimodal (image/video) token replacement. Also extends base_mllm_tokenize_fn to accept the new "image" / "video" content type keys alongside existing "image_url" / "video_url".

Issues

Critical

qwen35_chat.py:297-299 — Assertion message is a plain string, not an f-string. {len(text)} will be printed literally on failure instead of showing the actual lengths.

Warning

qwen35_chat.py:408-414 — tokenize() mutates self.messages in place (inserts/replaces system message). Calling it twice on the same object will produce incorrect results.
qwen35_chat.py:396 — Qwen35ChatMessages extends BaseModel directly instead of BaseMessages, breaking the interface contract for the messages module.
qwen35_chat.py — Multiple public functions missing return type annotations (project requires type hints per CLAUDE.md).
qwen35_chat.py:295 — qwen35_process_text_and_loss_mask appears unused and duplicates logic already in qwen35_tokenize_fn_fastspeed.

Nit

Missing copyright header in qwen35_chat.py
Debug print statement left in test_qwen35_vl_tokenize_fn.py:249
base_mllm_tokenize_fn.py — repeated key-extraction pattern for "image_url"/"image" and "video_url"/"video" could be simplified

Verdict

REQUEST_CHANGES

hhaAndroid · 2026-04-08T09:10:49Z

@claude review

claude · 2026-04-08T09:18:33Z

xtuner/v1/data_proto/messages/qwen35_chat.py

+            if chat_template.default_system is not None:
+                if self.messages[0]["role"] == "system":
+                    self.messages[0]["content"] = chat_template.default_system
+                else:
+                    self.messages.insert(0, {"role": "system", "content": chat_template.default_system})


Claude: Warning: tokenize() still mutates self.messages in place — this was flagged in the previous review and remains unresolved.

When chat_template.default_system is not None and messages[0] is not a system message, self.messages.insert(0, ...) permanently modifies the object's state. If tokenize() is called a second time the inserted system message would already be there and its content would be overwritten rather than inserted again — but the original caller's data has been silently altered.

Suggested fix — operate on a shallow copy:

messages = [m.copy() for m in self.messages] if chat_template.default_system is not None: if messages[0]["role"] == "system": messages[0]["content"] = chat_template.default_system else: messages.insert(0, {"role": "system", "content": chat_template.default_system}) token_ids, label_ids = qwen35_tokenize_fn_fastspeed( messages, tokenizer, self.tools, add_vision_id=add_vision_id, return_labels=True )

@claude 不需要考虑这种情况

claude · 2026-04-08T09:18:51Z

Claude: ## Re-review Summary

Thanks for the updates. Several issues from the first review have been addressed:

Fixed: Copyright header added (patch 4)
Fixed: Unused qwen35_process_text_and_loss_mask removed (patch 4)
Fixed: Assertion f-string bug resolved (function removed)
Fixed: base_mllm_tokenize_fn.py key-extraction simplified (patch 4)
Acknowledged: Qwen35ChatMessages not extending BaseMessages — comment added explaining the intentional design decision

Remaining Issues

Warning

tokenize() mutates self.messages in place (posted inline comment). When default_system is set, the method inserts/replaces system messages on the original list. Work on a copy instead.

Nit

Debug print at test_qwen35_vl_tokenize_fn.py:249 — print(pixel_values_xtuner.size(), image_grid_thw_xtuner, text.count('seconds>'), 'xxx') should be removed.
Missing return type hints on public functions — get_offset_mapping, render_content, qwen35_tokenize_fn_fastspeed, qwen35_tokenize_fn_slowspeed still lack annotations per project standards.

Verdict

REQUEST_CHANGES — the in-place mutation bug in tokenize() should be fixed before merge.

hhaAndroid · 2026-04-08T10:10:18Z

@claude review

claude · 2026-04-08T10:15:07Z

xtuner/v1/data_proto/messages/qwen35_chat.py

+        encoded = tokenizer(
+            result,
+            return_offsets_mapping=True,
+            add_special_tokens=False,
+        )
+        input_ids = encoded["input_ids"]


Claude: Warning: The except Exception catch is overly broad. Per CLAUDE.md: "Never use bare except: — always catch specific exception types." While this is Exception rather than bare except, it still silently swallows errors like TypeError, ValueError, or KeyError that could indicate real bugs in the data or tokenizer usage.

If the intent is to handle tokenizers that don't support return_offsets_mapping, catch the specific exception that raises (e.g., NotImplementedError or TypeError):

Suggested change

encoded = tokenizer(

result,

return_offsets_mapping=True,

add_special_tokens=False,

)

input_ids = encoded["input_ids"]

except (NotImplementedError, TypeError):

claude · 2026-04-08T10:15:10Z

xtuner/v1/data_proto/messages/qwen35_chat.py

+                last_query_index = i
+
+    # ── 主循环 ────────────────────────────────────────────────────────────
+    for idx, message in enumerate(messages):


Claude: Nit: The system message at idx=0 is already rendered in the pre-loop block (lines 141-155), but this line renders it again with do_vision_count=True. The result content is discarded (no branch handles role=="system" and is_first), but _render has a side effect — it increments image_count/video_count via nonlocal.

If a system message ever contains multimodal content (images/videos), these counters would be incremented spuriously, causing wrong counts for subsequent messages.

Consider skipping the render for the already-handled system message:

if role == "system" and is_first: continue content = _render(message["content"], True).strip()

xtuner/v1/data_proto/messages/qwen35_chat.py

claude · 2026-04-08T10:15:31Z

Claude: ## Re-review (Round 3) Summary

Most issues from previous reviews have been addressed — nice work on the cleanup across patches 2-5. The copyright header, unused function, debug prints, and key-extraction simplification are all resolved.

Remaining Issues

Warning

qwen35_chat.py:275-280 — except Exception is too broad; should catch the specific exception type for tokenizers that don't support return_offsets_mapping.
qwen35_chat.py — Public functions still missing return type annotations (carried over from prior review).

Nit

qwen35_chat.py:175 — First system message is rendered twice (pre-loop + main loop) with different do_vision_count flags, causing phantom side effects on image/video counters.

Verdict

COMMENT — the broad exception catch and missing type hints are worth addressing, but neither is a blocking correctness issue given the current test coverage. The phantom counter increment is theoretical (system messages are text-only in practice) but easy to fix.

hhaAndroid added 3 commits April 7, 2026 12:16

support qwen3.5 chat template

529acfe

Merge branch 'main' of https://github.com/InternLM/xtuner into qwen35…

5a754de

…_orig_jinja

update ci

eeff359