vllm
68c4421b - [AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD (#12282)

Commit
155 days ago
[AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD (#12282)

Signed-off-by: Randall Smith <Randall.Smith@amd.com>
  • tests/kernels
    • test_triton_scaled_mm.py
  • vllm/model_executor/layers/quantization/kernels/scaled_mm
    • __init__.py
    • triton.py