vllm
Prototyping single batch overlapping for Deepseek EP
#29211
Open

mxz297 wants to merge 26 commits into vllm-project:main from mxz297:sbo_cutedsl_deepepll
wenscarl Add flashinfer_cutedsl grouped gemm
c0639113
wenscarl Make fused version work with cuda graph
8a224dae
wenscarl fix pre-commit
ec6acfdb
wenscarl Update test
2a31b4cf
wenscarl Update test
ad67ea93
Add DeepEP LL nvfp4 dispatch.
65548dd5
wenscarl Merge remote-tracking branch 'origin/main' into fp4dispatch
eff9ea06
wenscarl Merge branch 'main' into cutedsl_grp_gemm
6ccf1108
wenscarl Merge remote-tracking branch 'origin/main' into cutedsl_grp_gemm
26068065
wenscarl Merge remote-tracking branch 'origin/main' into fp4dispatch
35cb3ff2
wenscarl Fix pre-commit
8b917f9c
wenscarl Fix pre-commit
365a8ff9
wenscarl Fix after refactor
b90f3473
wenscarl Merge branch 'cutedsl_grp_gemm' into fp4dispatch
a64fc288
wenscarl Add log
87d9ce6d
wenscarl Add flashinfer_cutedsl grouped gemm
0a831330
wenscarl Merge branch 'cutedsl_grp_gemm' into fp4dispatch
ceb9ec97
wenscarl Upd
30b829db
wenscarl Avoid nan by torch.ones
e1837e64
wenscarl Fix typo
a93ea3d0
mxz297 Merge branch 'main' into cutedsl
3762aa1c
mxz297 Prototyping SBO for cutedsl moe + deepep ll
b7b34e04
mergify added deepseek
mergify added nvidia
mergify added needs-rebase
yewentao256 commented on 2025-11-22
mxz297 Fix VLLM_EP_USE_SBO=0 case
3e98ae79
mxz297 Use 56 SMs for deepep ll SBO
b052d4e8
mxz297 Use 32 comm sm
ba37d7f8
mxz297 Merge branch 'main' into sbo_cutedsl_deepepll
2d8c13b8
mergify removed needs-rebase
