vllm
[EP+DP] Optimize the little operations in the DeepGEMM + DeepEP low latency case
#19885
Merged
WoosukKwon merged 10 commits into main from ll_deepgemm_opt
deep_ep + use_fp8_dispatch (8de2fd39)
Merge remote-tracking branch 'nm/varun/deepep-fp8-dispatch' into ll_d… (104a984e)
DeepGEMM LL optimizations (299f8291)
fixes - use-fp8-dispatch (2b5ad9f2)
gemini-code-assist commented on 2025-06-20
mergify added qwen
gemini-code-assist commented on 2025-06-20
mgoin approved these changes on 2025-06-20
mgoin added deepseek
mgoin added performance
Unit test (d5f20676)
fixes (26fd8ca3)
precommit (7a821f0e)
tlrmchlsmth requested a review from WoosukKwon 339 days ago
tweaks (39d5d33f)
fixup (21ffc735)
tlrmchlsmth enabled auto-merge (squash) 339 days ago
github-actions added ready
tolerances (b4f17e12)
disabled auto-merge 336 days ago (manually disabled by user)
WoosukKwon merged 68aaeb37 into main 336 days ago
WoosukKwon deleted the ll_deepgemm_opt branch 336 days ago
Reviewers: mgoin, gemini-code-assist, WoosukKwon
Assignees: No one assigned
Labels: performance, ready, qwen, deepseek
Milestone: No milestone