vllm
68c4421b
- [AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD (#12282)
Commit
155 days ago
[AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD (#12282) Signed-off-by: Randall Smith <Randall.Smith@amd.com>
References
#12282 - [AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD
Author
rasmith
Parents
aea94362
Files (3)
tests/kernels
test_triton_scaled_mm.py
vllm/model_executor/layers/quantization/kernels/scaled_mm
__init__.py
triton.py
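The kernel added in `triton.py` implements a scaled int8 matmul: multiply int8 operands, accumulate in int32, then dequantize with the two scales. As a minimal sketch of that semantics (a NumPy reference, not the actual vLLM/Triton API; function and argument names are hypothetical):

```python
import numpy as np

def scaled_mm_reference(a_int8, b_int8, scale_a, scale_b):
    """Reference semantics for a scaled int8 matmul:
    accumulate in int32 to avoid overflow, then apply
    per-tensor scales to dequantize to float32."""
    acc = a_int8.astype(np.int32) @ b_int8.astype(np.int32)
    return acc.astype(np.float32) * scale_a * scale_b

# Tiny example with per-tensor scales.
a = np.array([[1, -2], [3, 4]], dtype=np.int8)
b = np.array([[2, 0], [1, -1]], dtype=np.int8)
out = scaled_mm_reference(a, b, 0.5, 0.25)
# out == [[0.0, 0.25], [1.25, -0.5]]
```

A Triton implementation tiles this same computation across GPU thread blocks; the point of this commit is providing that Triton path for AMD GPUs where the existing int8 cutlass-style kernel is broken.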