vllm c11de33d - [Bugfix][Kernel] Fix per-token/per-channel quantization for Hopper scaled mm (#12696)
Commit
317 days ago
[Bugfix][Kernel] Fix per-token/per-channel quantization for Hopper scaled mm (#12696)

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>
References
#12696 - [Bugfix][Kernel] Fix per-token/per-channel quantization for Hopper scaled mm
Author
tlrmchlsmth
Parents
33e0602e