Added support for NANOO fp8.
Different HW vendors use several different genres of fp8 formats.
Two popular genres are:
- OCP fp8, which is used natively on NVIDIA H100.
- NANOO fp8, which is used natively on AMD MI300 and Graphcore HW.
These two genres of fp8 formats work very similarly. This PR enables
support for NANOO fp8, as it is now also supported in JAX and XLA.
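
To illustrate how close the two genres are, here is a minimal sketch
of decoding an 8-bit E4M3 value under each convention. It is not code
from this PR: `decode_e4m3` and its `nanoo` flag are hypothetical
names, and the sketch assumes the E4M3 layouts as I understand them
from the two papers referenced below (OCP: exponent bias 7, NaN at
S.1111.111; NANOO: exponent bias 8, a single NaN at 0x80, no negative
zero). Neither convention encodes infinities for E4M3.

```python
import math


def decode_e4m3(byte: int, nanoo: bool) -> float:
    """Decode one E4M3 byte (1 sign, 4 exponent, 3 mantissa bits).

    Illustrative helper only; the name and the `nanoo` flag are
    assumptions, not part of any library API.
    """
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    man = byte & 0x7

    if nanoo:
        bias = 8
        # NANOO: the would-be "negative zero" pattern is the only NaN.
        if byte == 0x80:
            return math.nan
    else:
        bias = 7
        # OCP: all exponent and mantissa bits set means NaN.
        if exp == 0xF and man == 0x7:
            return math.nan

    if exp == 0:
        # Zero and subnormals: no implicit leading 1.
        return sign * (man / 8) * 2.0 ** (1 - bias)
    return sign * (1 + man / 8) * 2.0 ** (exp - bias)


print(decode_e4m3(0x7E, nanoo=False))  # largest finite OCP value: 448.0
print(decode_e4m3(0x7F, nanoo=True))   # largest finite NANOO value: 240.0
print(decode_e4m3(0x38, nanoo=False))  # 1.0
print(decode_e4m3(0x38, nanoo=True))   # 0.5 (bias is one larger)
```

Under these assumptions the same bit pattern differs only by the bias
shift and the NaN/zero encodings, which is why a single code path can
usually serve both genres with a dtype switch.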
References:
- OCP fp8 paper: https://arxiv.org/abs/2209.05433
- NANOO fp8 paper: https://arxiv.org/abs/2206.02915
- JAX PR: https://github.com/google/jax/pull/21376
- XLA PR: https://github.com/openxla/xla/pull/9531