qwen3 omni support long audio #1268

Open

WANDY666 wants to merge 53 commits into main from lightllm-long-audio
Conversation

Contributor

WANDY666 commented Apr 9, 2026

No description provided.

WANDY666 and others added 30 commits March 26, 2026 08:17
… to remove first-request audio cold-start latency.
…ize_omni_merge

# Conflicts:
#	lightllm/models/qwen3_omni_moe_thinker/qwen3_omni_audio.py
#	lightllm/models/whisper/whisper_audio.py
#	lightllm/server/api_start.py
#	lightllm/server/audioserver/manager.py
#	lightllm/server/audioserver/model_infer/model_rpc.py
#	lightllm/server/httpserver/manager.py
#	lightllm/utils/multimodal_utils.py
Contributor

gemini-code-assist bot left a comment


Code Review

This pull request updates the Qwen3 Omni MoE Thinker model by disabling default audio truncation, simplifying token length calculations, and making the convolution chunk size configurable via environment variables. It also adds inference mode decorators and a new check_long_audio_infer method to validate long audio processing during initialization. Feedback was provided to use the model parameters' actual dtype in check_long_audio_infer to avoid potential type mismatches after model casting.
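The review summary mentions making the convolution chunk size configurable via environment variables. A minimal sketch of that pattern, assuming a hypothetical variable name `LIGHTLLM_AUDIO_CONV_CHUNK_SIZE` and a hypothetical default value (the actual variable name and default used in the PR may differ):

```python
import os

# Hypothetical default; the PR's actual fallback value may differ.
DEFAULT_CONV_CHUNK_SIZE = 3000


def get_conv_chunk_size() -> int:
    """Return the audio convolution chunk size.

    Reads a hypothetical LIGHTLLM_AUDIO_CONV_CHUNK_SIZE environment
    variable so deployments can tune chunking for long audio without
    code changes, falling back to the built-in default.
    """
    return int(os.environ.get("LIGHTLLM_AUDIO_CONV_CHUNK_SIZE",
                              DEFAULT_CONV_CHUNK_SIZE))
```

Resolving the value once at model initialization (rather than per request) keeps the chunking behavior consistent for the lifetime of the server, which also matches the PR's idea of validating long-audio inference up front in `check_long_audio_infer`.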

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

2 participants