vllm
[Quantization] Add compressed-tensors NVFP4 support
#18312
Merged
mgoin merged 8 commits into vllm-project:main from neuralmagic:nvfp4_emulation
dsikka marked this pull request as ready for review 261 days ago
dsikka requested a review from mgoin 261 days ago
dsikka requested a review from robertgshaw2-redhat 261 days ago
dsikka requested a review from tlrmchlsmth 261 days ago
mgoin commented on 2025-05-18
mgoin added the quantization label
mgoin added the ready label
dsikka force pushed 259 days ago
dsikka changed the title from "[Quantization] Add compressed-tensors NVFP4 emulation support" to "[Quantization] Add compressed-tensors NVFP4 support" 259 days ago
mgoin approved these changes on 2025-05-21
dsikka marked this pull request as draft 255 days ago
dsikka force pushed 241 days ago
dsikka marked this pull request as ready for review 241 days ago
dsikka requested a review from mgoin 241 days ago
mgoin approved these changes on 2025-06-06
mgoin enabled auto-merge (squash) 241 days ago
add ct nvfp4 emulation support 1ddbee3e
fix conditions; add test models d15990a4
add cutlass support d4471543
clean-up 82a26e05
update 3d4a7b23
fix condition; use strategy/compression format enums 362517b3
update d729e08f
remove incorrect check f716a340
auto-merge was disabled 240 days ago (head branch was pushed to by a user without write access)
dsikka force pushed to f716a340 240 days ago
mgoin merged c123bc33 into main 239 days ago
Reviewers: mgoin, robertgshaw2-redhat, tlrmchlsmth
Assignees: No one assigned
Labels: ready
Milestone: No milestone