# PR: Fix TypeError in Online Serving + vLLM video inference parameters
Hi Kwai Team, thank you for the great work and for reviewing this PR.
## Summary

Removes the unsupported `return_video_kwargs` argument passed to `process_vision_info`, which currently raises a `TypeError` during online serving.

## Changes
- Removed `return_video_kwargs=True` from the demo/serving code (see the sketch below).
- Dropped the now-unused `video_kwargs` result accordingly.
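As a minimal sketch of the fix (assuming `keye_vl_utils` exports `process_vision_info` at the top level, as the path `src/keye_vl_utils/vision_process.py` suggests, and that it returns `(image_inputs, video_inputs)`; the `messages` payload here is hypothetical):

```python
from keye_vl_utils import process_vision_info

# Hypothetical chat-style input; the real demo/serving code builds this
# from the incoming request.
messages = [{
    "role": "user",
    "content": [
        {"type": "video", "video": "file:///path/to/video.mp4"},
        {"type": "text", "text": "Describe this video."},
    ],
}]

# Before (raised TypeError: unexpected keyword argument 'return_video_kwargs'):
# image_inputs, video_inputs, video_kwargs = process_vision_info(
#     messages, return_video_kwargs=True
# )

# After: call without the unsupported keyword; no video_kwargs is returned.
image_inputs, video_inputs = process_vision_info(messages)
```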
## Rationale

`process_vision_info` in `src/keye_vl_utils/vision_process.py` doesn't accept `return_video_kwargs`; passing it raises:
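This is Python's standard unexpected-keyword error; reconstructed, it reads roughly:

```text
TypeError: process_vision_info() got an unexpected keyword argument 'return_video_kwargs'
```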
## vLLM Notes (to avoid freezes with long videos)

- `--max-num-batched-tokens 80960`: avoids batching issues with long sequences.
- `--max-model-len 80960`: supports the extended context from video frames.

A sketch of a launch command with these flags follows the list.
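For example, a hypothetical serving launch with these parameters (`$MODEL` is a placeholder for the Keye-VL checkpoint path; other flags depend on your deployment):

```bash
# Example launch; $MODEL is a placeholder for the Keye-VL checkpoint path.
vllm serve "$MODEL" \
  --trust-remote-code \
  --max-num-batched-tokens 80960 \
  --max-model-len 80960
```

## Testing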