vllm
[AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD
#12282
Merged
rasmith — TritonScaledMMLinearKernel implementation (9e8bad6c)
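For context, a ScaledMM linear kernel computes an int8×int8 matmul with int32 accumulation and then rescales the result to floating point. This is a minimal NumPy sketch of those w8a8 semantics, not the Triton kernel added in this PR; the function name and symmetric per-tensor quantization scheme are illustrative assumptions.

```python
import numpy as np

def scaled_mm_int8(a_q, b_q, scale_a, scale_b):
    """Illustrative sketch of scaled int8 matmul (w8a8):
    accumulate in int32, then rescale to float32."""
    acc = a_q.astype(np.int32) @ b_q.astype(np.int32)
    return acc.astype(np.float32) * scale_a * scale_b

# Symmetric per-tensor quantization of a small example to int8
a = np.array([[0.5, -1.0], [2.0, 0.25]], dtype=np.float32)
b = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)
scale_a = np.abs(a).max() / 127.0
scale_b = np.abs(b).max() / 127.0
a_q = np.clip(np.round(a / scale_a), -128, 127).astype(np.int8)
b_q = np.clip(np.round(b / scale_b), -128, 127).astype(np.int8)

# Result approximates the float matmul a @ b
out = scaled_mm_int8(a_q, b_q, scale_a, scale_b)
```

The key point is that the accumulator must be int32 (int8 products overflow int8), with the dequantization scales applied once at the end.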
rasmith requested a review from mgoin 326 days ago
rasmith requested a review from robertgshaw2-redhat 326 days ago
rasmith requested a review from tlrmchlsmth 326 days ago
mgoin approved these changes on 2025-01-21
robertgshaw2-redhat approved these changes on 2025-01-21
rasmith — Add regression test for rocm w8a8 (daf9a719)
rasmith requested a review from WoosukKwon 325 days ago
rasmith — remove unused import (9c11d5c1)
rasmith — ruff (4e4d633e)
mgoin approved these changes on 2025-01-22
mgoin added the quantization label
mgoin added the ready label
mgoin enabled auto-merge (squash) 325 days ago
mgoin merged 68c4421b into main 324 days ago