DeepSpeed
c5e48f49 - Add fp8_gemm fallback for non-triton systems (#6916)

Commit

347 days ago

Add fp8_gemm fallback for non-triton systems (#6916)

- Removed try/except from __init__ file in fp_quantizer and added a single entry point instead
- Renamed file fp8_gemm to fp8_gemm_triton, and the function matmul_fp8 to matmul_fp8_triton
- Added a new entry point fp8_gemm with matmul_fp8 inside; if the system supports triton it calls the triton implementation, and if not it calls the fallback

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
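The single-entry-point pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `triton_available`, `_matmul_triton`, `_matmul_fallback`, and `matmul` are hypothetical, the triton branch is a placeholder, and the fallback is a plain-Python matrix multiply standing in for whatever the real fallback does with FP8 tensors.

```python
import importlib.util


def triton_available() -> bool:
    """Return True if the `triton` package can be found on this system."""
    return importlib.util.find_spec("triton") is not None


def _matmul_triton(a, b):
    """Hypothetical placeholder for a triton-backed kernel (not implemented here)."""
    raise NotImplementedError("triton kernel is outside the scope of this sketch")


def _matmul_fallback(a, b):
    """Hypothetical fallback: plain-Python matrix multiply on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matmul(a, b, use_triton=None):
    """Single entry point: pick the backend once, in one place.

    `use_triton=None` means auto-detect; callers never import a backend directly,
    so no try/except around imports is needed at the package level.
    """
    if use_triton is None:
        use_triton = triton_available()
    impl = _matmul_triton if use_triton else _matmul_fallback
    return impl(a, b)


if __name__ == "__main__":
    # Force the fallback path so the example runs on any system.
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], use_triton=False))
```

Keeping the availability check inside one entry point means the choice of backend is made in a single place, which is the stated motivation for dropping the try/except from the package __init__ file.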

References

#6916 - Add fp8_gemm fallback for non-triton systems

Author

oelayan7

Parents

f8c9f314
