DeepSpeed
c5e48f49 - Add fp8_gemm fallback for non-triton systems (#6916)

Commit
347 days ago
Add fp8_gemm fallback for non-triton systems (#6916)

- Removed try/except from __init__ file in fp_quantizer and added a single entry point instead
- Renamed file fp8_gemm to fp8_gemm_triton, and the function matmul_fp8 to matmul_fp8_triton
- Added a new entry point fp8_gemm with matmul_fp8 inside, and if the system supports triton it calls the triton implementation and if not it calls the fallback

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
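
The single-entry-point pattern described in the commit message can be sketched roughly as follows. This is an illustrative, dependency-free sketch and not the code from the commit: the names `triton_available`, `matmul_fallback`, `matmul_triton_stub`, and `matmul` are hypothetical, the fallback here is a plain nested-list matrix multiply rather than an FP8 GEMM, and the way the real code detects Triton support is an assumption.

```python
import importlib.util


def triton_available() -> bool:
    """Return True if a top-level `triton` package can be found (assumed check)."""
    return importlib.util.find_spec("triton") is not None


def matmul_fallback(a, b):
    """Hypothetical stand-in for the non-Triton path: plain nested-list matmul."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def matmul_triton_stub(a, b):
    """Hypothetical stand-in for the Triton path; a real kernel would live here."""
    raise NotImplementedError("Triton kernel is not part of this sketch")


def matmul(a, b, use_triton=None):
    """Single entry point: choose the implementation once, in one place.

    `use_triton` is a hypothetical override for illustration; when it is None
    the choice follows the availability check above.
    """
    if use_triton is None:
        use_triton = triton_available()
    impl = matmul_triton_stub if use_triton else matmul_fallback
    return impl(a, b)


# Force the fallback path so the example behaves the same on any system.
result = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], use_triton=False)
```

Keeping the selection inside one function means callers import a single name and do not need their own try/except around an optional dependency.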