vllm
Prototyping single batch overlapping for Deepseek EP
#29211
Open
mxz297 wants to merge 26 commits into vllm-project:main from mxz297:sbo_cutedsl_deepepll
Add flashinfer_cutedsl grouped gemm
c0639113
Make fused version work with cuda graph
8a224dae
fix pre-commit
ec6acfdb
Update test
2a31b4cf
Update test
ad67ea93
Add DeepEP LL nvfp4 dispatch.
65548dd5
Merge remote-tracking branch 'origin/main' into fp4dispatch
eff9ea06
Merge branch 'main' into cutedsl_grp_gemm
6ccf1108
Merge remote-tracking branch 'origin/main' into cutedsl_grp_gemm
26068065
Merge remote-tracking branch 'origin/main' into fp4dispatch
35cb3ff2
Fix pre-commit
8b917f9c
Fix pre-commit
365a8ff9
Fix after refactor
b90f3473
Merge branch 'cutedsl_grp_gemm' into fp4dispatch
a64fc288
Add log
87d9ce6d
Add flashinfer_cutedsl grouped gemm
0a831330
Merge branch 'cutedsl_grp_gemm' into fp4dispatch
ceb9ec97
Upd
30b829db
Avoid nan by torch.ones
e1837e64
Fix typo
a93ea3d0
Merge branch 'main' into cutedsl
3762aa1c
Prototyping SBO for cutedsl moe + deepep ll
b7b34e04
mergify added deepseek
mergify added nvidia
mergify added needs-rebase
yewentao256 commented on 2025-11-22
Fix VLLM_EP_USE_SBO=0 case
3e98ae79
Use 56 SMs for deepep ll SBO
b052d4e8
Use 32 comm sm
ba37d7f8
Merge branch 'main' into sbo_cutedsl_deepepll
2d8c13b8
mergify removed needs-rebase
Reviewers
yewentao256
Assignees
No one assigned
Labels
deepseek
nvidia
Milestone
No milestone