qwen3 omni support long audio #1268

Open

WANDY666 wants to merge 53 commits into main from lightllm-long-audio
Conversation

Contributor

WANDY666 commented Apr 9, 2026

No description provided.

WANDY666 and others added 30 commits March 26, 2026 08:17
… to remove first-request audio cold-start latency.
…ize_omni_merge

# Conflicts:
#	lightllm/models/qwen3_omni_moe_thinker/qwen3_omni_audio.py
#	lightllm/models/whisper/whisper_audio.py
#	lightllm/server/api_start.py
#	lightllm/server/audioserver/manager.py
#	lightllm/server/audioserver/model_infer/model_rpc.py
#	lightllm/server/httpserver/manager.py
#	lightllm/utils/multimodal_utils.py
Contributor

gemini-code-assist bot left a comment


Code Review

This pull request updates the Qwen3 Omni MoE Thinker model by disabling default audio truncation, simplifying token length calculations, and making the convolution chunk size configurable via environment variables. It also adds inference mode decorators and a new check_long_audio_infer method to validate long audio processing during initialization. Feedback was provided to use the model parameters' actual dtype in check_long_audio_infer to avoid potential type mismatches after model casting.
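The review summary mentions making the convolution chunk size configurable via environment variables. A minimal sketch of that pattern, assuming a hypothetical variable name `LIGHTLLM_AUDIO_CONV_CHUNK_SIZE` and a hypothetical default value (the actual variable name and default used in the PR may differ):

```python
import os

# Hypothetical default; the PR's actual fallback value may differ.
DEFAULT_CONV_CHUNK_SIZE = 3000


def get_conv_chunk_size() -> int:
    """Return the audio convolution chunk size.

    Reads a hypothetical LIGHTLLM_AUDIO_CONV_CHUNK_SIZE environment
    variable so deployments can tune chunking for long audio without
    code changes, falling back to the built-in default.
    """
    return int(os.environ.get("LIGHTLLM_AUDIO_CONV_CHUNK_SIZE",
                              DEFAULT_CONV_CHUNK_SIZE))
```

Resolving the value once at model initialization (rather than per request) keeps the chunking behavior consistent for the lifetime of the server, which also matches the PR's idea of validating long-audio inference up front in `check_long_audio_infer`.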

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

2 participants