[Feature] Qwen3 VL eagle3 support #251
base: main
|
Training finishes for tp size 1 with both the sdpa and flex attention backends. The loss/acc curves look OK. I'm going to add qwen3_vl_moe eagle3 support to sglang/vllm (so I can eval) before adding support for tp size > 1. |
47ac5c9 to 1f08199
|
Marking this ready for review, adds Training graphs: |
|
When will verification be supported? |
There's a branch of sglang here that you can run. It's currently being cleaned up for upstreaming. We were able to confirm an accept length of almost |
|
Can you rebase this PR to the latest main? |
yes I'll get to it this week |
|
sorry for the delay @FrankLeeeee I finished the rebase |
|
@KerwinKai this PR seems to overlap with yours, can you take a look? |
Yes, I'll try. |
|
` elif ( Initialize the target model using Qwen3VLForConditionalGeneration from the Transformers library, but the class definition does not include set_aux_hidden_states_layers, causing the error: How should I modify Qwen3VLForConditionalGeneration? I noticed there is a class Eagle3TargetModel(ABC), but I’m not sure how to use it. |
you should specify backend as |
Thanks a lot! But I finally resolved the 'set_aux_hidden_states_layers' issue by |
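For readers hitting the same `set_aux_hidden_states_layers` error, here is a minimal sketch of the general pattern: wrap a model class that lacks the method in an object that provides it and collects hidden states from the chosen layers. This is only an illustration under stated assumptions. `AuxHiddenStateTarget`, `ToyBackbone`, and `WrappedTarget` are hypothetical names invented for this example, the method signature is assumed, and none of this is the project's actual `Eagle3TargetModel` or the fix used above.

```python
from abc import ABC, abstractmethod


class AuxHiddenStateTarget(ABC):
    """Hypothetical interface (NOT the project's Eagle3TargetModel)."""

    @abstractmethod
    def set_aux_hidden_states_layers(self, layer_ids):
        """Choose which layers' outputs to collect (assumed signature)."""

    @abstractmethod
    def forward(self, x):
        """Return (final_output, list_of_aux_hidden_states)."""


class ToyBackbone:
    """Stand-in for a model class that lacks set_aux_hidden_states_layers."""

    def __init__(self, num_layers):
        # Each toy "layer" adds its 1-based index to the running value.
        self.layers = [lambda h, i=i: h + i for i in range(1, num_layers + 1)]


class WrappedTarget(AuxHiddenStateTarget):
    """Adds aux-hidden-state capture around a backbone without modifying it."""

    def __init__(self, backbone):
        self.backbone = backbone
        self.aux_layer_ids = []

    def set_aux_hidden_states_layers(self, layer_ids):
        num_layers = len(self.backbone.layers)
        for layer_id in layer_ids:
            if not 0 <= layer_id < num_layers:
                raise ValueError(f"layer id {layer_id} out of range")
        self.aux_layer_ids = list(layer_ids)

    def forward(self, x):
        aux = []
        h = x
        for idx, layer in enumerate(self.backbone.layers):
            h = layer(h)
            # Record the output of every selected (0-based) layer.
            if idx in self.aux_layer_ids:
                aux.append(h)
        return h, aux


target = WrappedTarget(ToyBackbone(num_layers=4))
target.set_aux_hidden_states_layers([0, 2])
final, aux = target.forward(0)
print(final, aux)
```

The wrapper leaves the backbone class untouched, which is the main design point: the upstream model definition does not need to be patched for the training code to find the method it expects.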
Motivation
**Draft PR. This is currently WIP.**
Add eagle3 support for qwen3_vl and qwen3_vl_moe models.
Modifications
Related Issues
Accuracy Test
Benchmark & Profiling
Checklist