vllm
[Quantization] Add compressed-tensors NVFP4 support
#18312
Merged

dsikka
github-actions
dsikka marked this pull request as ready for review 261 days ago
dsikka requested a review from mgoin 261 days ago
dsikka requested a review from robertgshaw2-redhat 261 days ago
dsikka requested a review from tlrmchlsmth 261 days ago
dsikka
mgoin
mgoin commented on 2025-05-18
mgoin added the quantization label
mgoin added the ready label
robertgshaw2-redhat
dsikka force pushed 259 days ago
dsikka changed the title from "[Quantization] Add compressed-tensors NVFP4 emulation support" to "[Quantization] Add compressed-tensors NVFP4 support" 259 days ago
mgoin
dsikka
mgoin
mgoin approved these changes on 2025-05-21
DarkLight1337
dsikka marked this pull request as draft 255 days ago
dsikka
dsikka force pushed 241 days ago
dsikka marked this pull request as ready for review 241 days ago
dsikka requested a review from mgoin 241 days ago
mgoin
mgoin approved these changes on 2025-06-06
mgoin enabled auto-merge (squash) 241 days ago
dsikka add ct nvfp4 emulation support
1ddbee3e
dsikka fix conditions; add test models
d15990a4
dsikka add cutlass support
d4471543
dsikka clean-up
82a26e05
dsikka update
3d4a7b23
dsikka fix condition; use strategy/compression format enums
362517b3
dsikka update
d729e08f
dsikka remove incorrect check
f716a340
disabled auto-merge 240 days ago
Head branch was pushed to by a user without write access
dsikka force pushed to f716a340 240 days ago
mgoin merged c123bc33 into main 239 days ago
codelayout
dsikka
codelayout
PhzCode
dsikka
