Pull requests: InternLM/lmdeploy
fix: change debug log from ERROR to DEBUG in RepetitionPenaltyKernel
#4363 opened Feb 15, 2026 by murray-macdonald
ci(lint): skip flaky deadlink test for python wiki page
#4357 opened Feb 13, 2026 by windreamer
Fix XGrammar bitmask initialization and add null check for gen_config in generate method
#4349 opened Feb 11, 2026 by windreamer
add preliminary support for EP(single-node) of turbomind backend
#4332 opened Feb 6, 2026 by irexyc
Qwen/Internlm/Llama Dense/Moe model fp8 quant online
enhancement
#4324 opened Feb 5, 2026 by 43758726
Compatible with transformers 5.0 at TurboMind side
improvement
#4304 opened Jan 28, 2026 by lvhan028
change ascend paged attention from BSH format to TND format for better performance
#4295 opened Jan 27, 2026 by jinminxi104 • Draft
support repetition ngram logits processor
enhancement
#4288 opened Jan 23, 2026 by grimoire
Support fp32 head for qwen and internlm models
improvement
#4160 opened Nov 27, 2025 by RunningLeon