vllm
c11de33d - [Bugfix][Kernel] Fix per-token/per-channel quantization for Hopper scaled mm (#12696)

Commit

317 days ago

[Bugfix][Kernel] Fix per-token/per-channel quantization for Hopper scaled mm (#12696)

Signed-off-by: Tyler Michael Smith <tyler@neuralmagic.com>

References

Author

tlrmchlsmth

Parents