[Fix] fix SP for InternS1 VL RL by tina-wen · Pull Request #1656 · InternLM/xtuner

tina-wen · 2026-04-06T16:22:32Z

Root Cause

Under sequence parallelism (SP), entropy statistics and GRPO batch loss calibration were computed from sharded tensors, so each SP rank worked only on its local slice of the sequence. This could produce incorrect token counts and inconsistent metrics across SP ranks. In addition, GRPO loss batching needed to accept the SP context forwarded from the training worker.
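To illustrate the mismatch, here is a minimal pure-Python sketch, not xtuner code. The ignore label `-100`, the `masked_mean` helper, and the two-rank split are all assumptions made for the example. It shows that a masked mean computed on each shard differs from the mean over the full sequence whenever the shards hold different numbers of valid tokens.

```python
IGNORE_INDEX = -100  # assumed ignore label, following the common PyTorch convention


def masked_mean(values, labels):
    """Mean of values at positions whose label is not IGNORE_INDEX."""
    kept = [v for v, lab in zip(values, labels) if lab != IGNORE_INDEX]
    return sum(kept) / len(kept) if kept else 0.0


# One sequence of 8 tokens; the first three positions are masked (e.g. prompt).
entropy = [0.5, 1.5, 2.0, 4.0, 1.0, 3.0, 2.0, 2.0]
labels = [-100, -100, -100, 7, 3, 9, 4, 2]

# Hypothetical split across 2 SP ranks along the sequence dimension.
shards = [(entropy[:4], labels[:4]), (entropy[4:], labels[4:])]

# Each rank computes a statistic from its own shard only.
per_rank = [masked_mean(e, lab) for e, lab in shards]

# Reference value computed on the unsharded sequence.
global_mean = masked_mean(entropy, labels)

print(per_rank)     # rank 0 sees 1 valid token, rank 1 sees 4
print(global_mean)  # token-weighted mean over all 5 valid tokens
```

In this toy case neither rank's local value matches the full-sequence value, and the two ranks disagree with each other, which is the kind of inconsistency described above.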

Fix

Gather shifted labels and logprobs before entropy aggregation under SP, pass sp_mesh into RL loss batching from the training worker, and make GRPO batch construction use the SP-aware token counting path.
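The gather-then-aggregate idea can be sketched as follows. This is a single-process simulation under stated assumptions, not the PR's implementation: `all_gather_concat` stands in for a real all-gather over the SP group, `aggregate_stats` is a hypothetical helper, `-100` is an assumed ignore label, and mean negative log-probability is used as a simple placeholder for the entropy statistic.

```python
IGNORE_INDEX = -100  # assumed ignore label, not confirmed by the PR


def all_gather_concat(shards):
    """Stand-in for an all-gather along the sequence dimension.

    Returns the concatenation of all shards in rank order, which is what
    every rank would hold after a real collective.
    """
    return [x for shard in shards for x in shard]


def aggregate_stats(logprob_shards, shifted_label_shards):
    """Gather first, then count valid tokens and aggregate once."""
    logprobs = all_gather_concat(logprob_shards)
    labels = all_gather_concat(shifted_label_shards)
    kept = [-lp for lp, lab in zip(logprobs, labels) if lab != IGNORE_INDEX]
    num_tokens = len(kept)
    mean_nll = sum(kept) / num_tokens if num_tokens else 0.0
    return {"num_tokens": num_tokens, "mean_neg_logprob": mean_nll}


# Two hypothetical SP ranks, each holding half of one sequence.
logprob_shards = [[-0.5, -1.0], [-2.0, -0.5]]
shifted_label_shards = [[-100, 5], [6, 7]]

stats = aggregate_stats(logprob_shards, shifted_label_shards)
print(stats)
```

Because the gathered tensors are identical on every rank, the token count and the aggregated statistic come out the same everywhere, at the cost of one extra collective per aggregation.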

[Fix] fix SP for VL-241B RL

63622fa

tina-wen changed the title from ~~[Fix] fix SP for VL-241B RL~~ to [Fix] fix SP for InternS1 VL RL on Apr 7, 2026

tina-wen wants to merge 1 commit into InternLM:main from tina-wen:rl_sp