vllm
Fix/resupport nongated fused moe triton
#36412
Merged
vllm-bot merged 24 commits into vllm-project:main from shaunkotek:fix/resupport-nongated-fused-moe-triton
shaunkotek marked this pull request as ready for review 62 days ago
shaunkotek requested a review from mgoin 62 days ago
shaunkotek requested a review from pavanimajety 62 days ago
gemini-code-assist commented on 2026-03-08
mgoin approved these changes on 2026-03-08
mgoin added ready
mgoin added nvidia
c1126dba resupport non gated fused moe in triton
bb679b1a [Model] Nano Nemotron VL - fast media preprocessing (#35657)
570bf720 [Frontend] Add GPU-less render serving path (`vllm launch render`) (#…
730745c2 Add support for ModelOpt MXFP8 MoE models (#35986)
5a47a952 [cudagraph] fix cudagraph warning in deepseekv32 (#28044)
d6ec74d7 [XPU][Doc] update xpu document about triton dependency/conflict issue…
ad818e78 Allow `markdownlint` to run locally (#36398)
6a392273 [Dependency] Remove default ray dependency (#36170)
6cefd2bb [Bugfix] Fix CPU OMP autobind assertion to use local_world_size (#35815)
6ffbf5c6 [Examples][1/n] Resettle basic examples. (#35579)
fa16442a fix: Use iterator as not to store all the file loads in memory at onc…
1bca5351 Increase Flexibility for OOV Multimodal Token Handling (#34858)
c7ebbe70 [Misc] Move processors to `transformers_utils` (#35953)
4fdaf0ad feat(attention): extract KV-cache update from FlexAttention backend (…
18c8fb96 [Bugfix] Skip out-of-stage layers in get_layers_from_vllm_config for …
cda0a73c [Frontend][2/n] Improve pooling entrypoints | embed. (#36110)
b3b43160 [Bugfix] Avoid to replace non-tensor members in cpu model runner (#36…
97c8f800 [Frontend] Add Support for MM Encoder/Decoder Beam Search (Online Tra…
a6291dfc [XPU] Add test script of PD disaggregation (#36434)
2bfb5759 [Kernel] Add fused_sigmoid_gating_delta_rule_update kernel for Qwen3 …
aae9f536 [Deprecation][1/2] Remove items deprecated in v0.18 (#36470)
9bbc5e63 [ci] Bound openai dependency to 2.24.0 (#36471)
e4b4a459 [MM Encoder] Default to use TORCH_SDPA backend for ViT on Volta/Turin…
shaunkotek force pushed to e4b4a459 61 days ago
shaunkotek requested a review from noooop 61 days ago
shaunkotek requested a review from tjtanaa 61 days ago
shaunkotek requested a review from patrickvonplaten 61 days ago
shaunkotek requested a review from sighingnow 61 days ago
shaunkotek requested a review from bigPYJ1151 61 days ago
shaunkotek requested a review from hmellor 61 days ago
shaunkotek requested a review from ApostaC 61 days ago
shaunkotek requested a review from orozery 61 days ago
shaunkotek requested a review from tlrmchlsmth 61 days ago
shaunkotek requested a review from WoosukKwon 61 days ago
shaunkotek requested a review from yewentao256 61 days ago
shaunkotek requested a review from DarkLight1337 61 days ago
shaunkotek requested a review from robertgshaw2-redhat 61 days ago
shaunkotek requested a review from aarnphm 61 days ago
shaunkotek requested a review from NickLucche 61 days ago
shaunkotek requested a review from njhill 61 days ago
shaunkotek requested a review from LucasWilkinson 61 days ago
shaunkotek requested a review from MatthewBonanni 61 days ago
shaunkotek requested a review from chaunceyjiang 61 days ago
shaunkotek requested a review from russellb 61 days ago
shaunkotek requested a review from youkaichao 61 days ago
shaunkotek requested a review from houseroad 61 days ago
shaunkotek requested a review from ProExpertProg 61 days ago
shaunkotek requested a review from ywang96 61 days ago
shaunkotek requested a review from 22quinn 61 days ago
shaunkotek requested a review from jeejeelee 61 days ago
shaunkotek requested a review from zou3519 61 days ago
shaunkotek requested a review from BoyuanFeng 61 days ago
mergify added documentation
mergify added ci/build
mergify added frontend
mergify added multi-modality
mergify added performance
mergify added qwen
mergify added rocm
mergify added cpu
mergify added speculative-decoding
mergify added v1
mergify added kv-connector
3614f5ad Merge branch 'main' into fix/resupport-nongated-fused-moe-triton
vllm-bot merged fa028207 into main 61 days ago
Reviewers
mgoin
gemini-code-assist
pavanimajety
noooop
tjtanaa
patrickvonplaten
sighingnow
bigPYJ1151
hmellor
ApostaC
orozery
tlrmchlsmth
WoosukKwon
yewentao256
DarkLight1337
robertgshaw2-redhat
aarnphm
NickLucche
njhill
LucasWilkinson
MatthewBonanni
chaunceyjiang
russellb
youkaichao
houseroad
ProExpertProg
ywang96
22quinn
jeejeelee
zou3519
BoyuanFeng
Assignees
No one assigned
Labels
documentation
performance
rocm
frontend
speculative-decoding
ready
ci/build
v1
multi-modality
qwen
cpu
kv-connector
nvidia
Milestone
No milestone