text-generation-inference
cb150eb2 - Add support for FP8 on compute capability >=8.0, <8.9 (#2213)

Commit
1 year ago
Use FP8 GPTQ-Marlin kernels to enable FP8 support on CUDA GPUs with compute capability >=8.0 and <8.9.

Co-authored-by: Florian Zimmermeister <flozi00.fz@gmail.com>
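
The capability range in the commit message can be illustrated with a small dispatch sketch. This is not the code from the commit: the function name `select_fp8_backend` and the returned labels are hypothetical, and the assumption that devices at compute capability 8.9 and above use a hardware FP8 path is inferred from the stated upper bound rather than taken from the change itself.

```python
from typing import Tuple


def select_fp8_backend(capability: Tuple[int, int]) -> str:
    """Pick an FP8 path from a CUDA compute capability (major, minor).

    Hypothetical helper for illustration only; the labels are not
    identifiers from text-generation-inference.
    """
    # Assumption: 8.9 and newer have hardware FP8 and need no fallback.
    if capability >= (8, 9):
        return "native"
    # Range named in the commit: >=8.0 and <8.9 uses FP8 GPTQ-Marlin kernels.
    if capability >= (8, 0):
        return "marlin"
    # Older devices are outside the range this commit covers.
    return "unsupported"


if __name__ == "__main__":
    for cap in [(7, 5), (8, 0), (8, 6), (8, 9), (9, 0)]:
        print(cap, select_fp8_backend(cap))
```

On a machine with PyTorch and a CUDA device, the tuple could come from `torch.cuda.get_device_capability()`, which returns `(major, minor)`. Comparing tuples rather than floats avoids misordering values such as 8.10 and 8.9.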