vllm
[Perf][fp8] Use CustomOp abstraction for fp8 quant for better perf
#19830
Merged

ProExpertProg
github-actions
mergify added v1
gemini-code-assist commented on 2025-06-19
gemini-code-assist commented on 2025-06-19
mergify added needs-rebase
ProExpertProg force pushed 331 days ago
mergify removed needs-rebase
mergify added performance
ProExpertProg force pushed 313 days ago
ProExpertProg force pushed 313 days ago
ProExpertProg marked this pull request as ready for review 313 days ago
ProExpertProg requested a review from mgoin 313 days ago
ProExpertProg requested a review from robertgshaw2-redhat 313 days ago
ProExpertProg requested a review from tlrmchlsmth 313 days ago
ProExpertProg changed the title from "FP8 custom ops" to "[Perf][fp8] Use CustomOp abstraction for fp8 quant for better perf" 313 days ago
ProExpertProg
gemini-code-assist commented on 2025-07-08
LucasWilkinson requested a review from LucasWilkinson 313 days ago
mgoin commented on 2025-07-09
ProExpertProg requested a review from WoosukKwon 312 days ago
ProExpertProg requested a review from zhuohan123 312 days ago
ProExpertProg requested a review from youkaichao 312 days ago
ProExpertProg requested a review from alexm-redhat 312 days ago
ProExpertProg requested a review from comaniac 312 days ago
ProExpertProg requested a review from njhill 312 days ago
mergify added rocm
LucasWilkinson approved these changes on 2025-07-10
LucasWilkinson added ready
ProExpertProg force pushed 311 days ago
yewentao256 commented on 2025-07-10
mgoin Michael changes
00f3924e
ProExpertProg Cleanup constants/utils
fcdfe982
ProExpertProg Add QuantFP8 (CustomOp subclass) and use it for FP8 GEMMs. Also fix f…
b79fb845
ProExpertProg Move GroupShape to quant_utils.py
4d4e31ae
ProExpertProg Remove old CustomOp classes, fix pre-commit
2f32e8f5
ProExpertProg fix pre-commit try 2
80405f56
ProExpertProg gemini feedback
5650acbb
ProExpertProg add issue for MoE
6abccfe2
ProExpertProg move file
49283d4c
ProExpertProg PR feedback
dc82a0bb
ProExpertProg Use GroupShape in fusion
dc97cd02
ProExpertProg Add quant_fp8 to fix fusion test
69cc866b
ProExpertProg Compilation fixes
404a4b2f
ProExpertProg force pushed to 404a4b2f 311 days ago
mgoin approved these changes on 2025-07-10
mgoin enabled auto-merge (squash) 311 days ago
mgoin Merge branch 'main' into luka/fp8-custom-ops
47a25b5c
mgoin merged 31d5c179 into main 311 days ago
