vllm
89342ce4 - [Quantization] [Performance] Enable Marlin GEMM kernels for the calibration-free RTN-based quantization (#26051)

Commit

87 days ago

[Quantization] [Performance] Enable Marlin GEMM kernels for the calibration-free RTN-based quantization (#26051)

Signed-off-by: Alex Kogan <alex.kogan@oracle.com>
Signed-off-by: Alex Kogan <82225080+sakogan@users.noreply.github.com>

References

#26051 - [Quantization] [Performance] Enable Marlin GEMM kernels for the calibration-free RTN-based quantization

Author

sakogan

Parents

f89f5993
