vllm
[AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD
#12282
Merged
rasmith — TritonScaledMMLinearKernel implementation (9e8bad6c)
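For context, a ScaledMM linear kernel computes an int8×int8 matmul with int32 accumulation and then rescales the result to floating point. This is a minimal NumPy sketch of those w8a8 semantics, not the Triton kernel added in this PR; the function name and symmetric per-tensor quantization scheme are illustrative assumptions.

```python
import numpy as np

def scaled_mm_int8(a_q, b_q, scale_a, scale_b):
    """Illustrative sketch of scaled int8 matmul (w8a8):
    accumulate in int32, then rescale to float32."""
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return acc.astype(np.float32) * scale_a * scale_b

# Symmetric per-tensor quantization of a small example to int8
a = np.array([[0.5, -1.0], [2.0, 0.25]], dtype=np.float32)
b = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)
scale_a = np.abs(a).max() / 127.0
scale_b = np.abs(b).max() / 127.0
a_q = np.clip(np.round(a / scale_a), -128, 127).astype(np.int8)
b_q = np.clip(np.round(b / scale_b), -128, 127).astype(np.int8)

# Result approximates the float matmul a @ b
out = scaled_mm_int8(a_q, b_q, scale_a, scale_b)
```

The key point is that the accumulator must be int32 (int8 products overflow int8), with the dequantization scales applied once at the end.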
rasmith requested a review from mgoin 326 days ago
rasmith requested a review from robertgshaw2-redhat 326 days ago
rasmith requested a review from tlrmchlsmth 326 days ago
mgoin approved these changes on 2025-01-21
robertgshaw2-redhat approved these changes on 2025-01-21
rasmith — Add regression test for rocm w8a8 (daf9a719)
rasmith requested a review from WoosukKwon 325 days ago
rasmith — remove unused import (9c11d5c1)
rasmith — ruff (4e4d633e)
mgoin approved these changes on 2025-01-22
mgoin added the quantization label
mgoin added the ready label
mgoin enabled auto-merge (squash) 325 days ago
mgoin merged 68c4421b into main 324 days ago