Add fp8 flash attention (#2382)
Summary:
Adds an fp8 flash attention operator to the Triton benchmark suite. Example run:

python run_benchmark.py triton --op fp8_attention --only triton_flash_v2 --metrics tflops
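
The command above reports fp8 attention throughput in TFLOPS. As a rough illustration of the operation being benchmarked, here is a minimal PyTorch sketch, assuming per-tensor fp8 (e4m3) quantization of Q/K/V and using scaled_dot_product_attention as a stand-in for the Triton kernel; fp8_attention_reference and the tensor shapes below are hypothetical, not part of this PR:

import torch
import torch.nn.functional as F

def fp8_attention_reference(q, k, v):
    # Illustrative only: the real Triton kernel computes directly on fp8
    # inputs; here we quantize to fp8 and dequantize back to fp16 so the
    # reference runs on any PyTorch build that has fp8 dtypes.
    def to_fp8_and_back(x):
        # Per-tensor scale so values fit the fp8 e4m3 range (max ~448).
        scale = x.abs().max().clamp(min=1e-12) / 448.0
        x_fp8 = (x / scale).to(torch.float8_e4m3fn)
        return x_fp8.to(torch.float16) * scale

    q8, k8, v8 = (to_fp8_and_back(t) for t in (q, k, v))
    return F.scaled_dot_product_attention(q8, k8, v8)

# Hypothetical shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(4, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)
out = fp8_attention_reference(q, k, v)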
Pull Request resolved: https://github.com/pytorch/benchmark/pull/2382
Reviewed By: xuzhao9
Differential Revision: D59970122
Pulled By: manman-ren
fbshipit-source-id: 1697a1f3e4ebae275403b677cc611d6b13e05d66