vllm
[Performance] Cublas Bf16 Gate with Fp32 Output
#35121
Merged

roikoren755
roikoren755 requested a review from tlrmchlsmth 9 days ago
roikoren755 requested a review from LucasWilkinson 9 days ago
roikoren755 requested a review from mgoin 9 days ago
roikoren755 requested a review from pavanimajety 9 days ago
mergify added ci/build
mergify added deepseek
gemini-code-assist commented on 2026-02-23
robertgshaw2-redhat changed the title from "Gate linear with fallback" to "[Performance] Cublas Bf16 Gate with Fp32 Output" 9 days ago
roikoren755 force pushed from 8bf87bf5 to 9e4bcb28 8 days ago
roikoren755 force pushed from 3b70bee2 to 50ecd354 8 days ago
roikoren755 force pushed from 50ecd354 to 25e2bf52 7 days ago
mgoin added performance
mgoin added nvidia
mgoin commented on 2026-02-25
roikoren755 Initial router custom op commit
9de54f41
roikoren755 Fix missing import
eab812dd
roikoren755 CR
f70d1619
roikoren755 Use not-deprecated GEMM algo
0e62a05c
roikoren755 CR and fixing fallback
5e18e1be
roikoren755 force pushed from 25e2bf52 to 5e18e1be 6 days ago
mgoin commented on 2026-02-26
mgoin added ready
vllm-bot merged 38c498b8 into main 5 days ago
roikoren755 deleted the feat/gate-linear-with-fallback branch 2 days ago
