Add fp8_gemm fallback for non-triton systems (#6916)
- Removed the try/except from the fp_quantizer __init__ file and added
a single entry point instead
- Renamed the file fp8_gemm to fp8_gemm_triton and the function
matmul_fp8 to matmul_fp8_triton
- Added a new entry point, fp8_gemm, containing matmul_fp8, which calls
the triton implementation when the system supports triton and the
fallback otherwise
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>