vllm
[Perf][fp8] Use CustomOp abstraction for fp8 quant for better perf
#19830
Merged
mgoin merged 14 commits into vllm-project:main from neuralmagic:luka/fp8-custom-ops
mergify added v1
gemini-code-assist commented on 2025-06-19
gemini-code-assist commented on 2025-06-19
mergify added needs-rebase
ProExpertProg force pushed 331 days ago
mergify removed needs-rebase
mergify added performance
ProExpertProg force pushed 313 days ago
ProExpertProg force pushed 313 days ago
ProExpertProg marked this pull request as ready for review 313 days ago
ProExpertProg requested a review from mgoin 313 days ago
ProExpertProg requested a review from robertgshaw2-redhat 313 days ago
ProExpertProg requested a review from tlrmchlsmth 313 days ago
ProExpertProg changed the title from "FP8 custom ops" to "[Perf][fp8] Use CustomOp abstraction for fp8 quant for better perf" 313 days ago
gemini-code-assist commented on 2025-07-08
LucasWilkinson requested a review from LucasWilkinson 313 days ago
mgoin commented on 2025-07-09
ProExpertProg requested a review from WoosukKwon 312 days ago
ProExpertProg requested a review from zhuohan123 312 days ago
ProExpertProg requested a review from youkaichao 312 days ago
ProExpertProg requested a review from alexm-redhat 312 days ago
ProExpertProg requested a review from comaniac 312 days ago
ProExpertProg requested a review from njhill 312 days ago
mergify added rocm
LucasWilkinson approved these changes on 2025-07-10
LucasWilkinson added ready
ProExpertProg force pushed 311 days ago
yewentao256 commented on 2025-07-10
Michael changes (00f3924e)
Cleanup constants/utils (fcdfe982)
Add QuantFP8 (CustomOp subclass) and use it for FP8 GEMMs. Also fix f… (b79fb845)
Move GroupShape to quant_utils.py (4d4e31ae)
Remove old CustomOp classes, fix pre-commit (2f32e8f5)
fix pre-commit try 2 (80405f56)
gemini feedback (5650acbb)
add issue for MoE (6abccfe2)
move file (49283d4c)
PR feedback (dc82a0bb)
Use GroupShape in fusion (dc97cd02)
Add quant_fp8 to fix fusion test (69cc866b)
Compilation fixes (404a4b2f)
ProExpertProg force pushed to 404a4b2f 311 days ago
mgoin approved these changes on 2025-07-10
mgoin enabled auto-merge (squash) 311 days ago
Merge branch 'main' into luka/fp8-custom-ops (47a25b5c)
mgoin merged 31d5c179 into main 311 days ago
Reviewers: mgoin, LucasWilkinson, yewentao256, gemini-code-assist, robertgshaw2-redhat, tlrmchlsmth, WoosukKwon, zhuohan123, youkaichao, alexm-redhat, comaniac, njhill
Assignees: No one assigned
Labels: performance, rocm, ready, v1
Milestone: No milestone