Add flash attention v3 (#2381)
Summary:
First, install the Flash Attention 3 (Hopper) kernel:
```
python install.py --userbenchmark triton --flash
```
Second, run the following command:
```
$ python run_benchmark.py triton --op flash_attention --only flash_v2,flash_v3 --num-inputs 1 --metrics tflops
[00:00<00:00, 5.70it/s]
  SeqLen    flash_v2-tflops    flash_v3-tflops
--------  -----------------  -----------------
     128            49.2482             33.825
```
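For context on what the `tflops` metric measures: attention throughput is conventionally derived from the two `seqlen x seqlen` matmuls in the kernel (`Q @ K^T` and `P @ V`). A minimal sketch of that arithmetic follows; the batch size, head count, head dimension, and latency below are illustrative assumptions, not the benchmark's actual configuration.

```python
def attention_tflops(batch: int, heads: int, seqlen: int, head_dim: int,
                     latency_ms: float, causal: bool = False) -> float:
    """Estimate attention TFLOPS from problem shape and measured latency.

    Illustrative formula only -- not the benchmark harness's code.
    """
    # Two matmuls per head, each contributing 2 * seqlen^2 * head_dim FLOPs:
    #   Q @ K^T : (seqlen, head_dim) @ (head_dim, seqlen)
    #   P @ V   : (seqlen, seqlen)   @ (seqlen, head_dim)
    flops = 4 * batch * heads * seqlen * seqlen * head_dim
    if causal:
        flops //= 2  # causal masking skips roughly half the work
    return flops / (latency_ms * 1e-3) / 1e12

# Hypothetical shape and latency: batch=8, heads=16, seqlen=128, head_dim=64, 0.01 ms
print(round(attention_tflops(8, 16, 128, 64, 0.01), 3))  # -> 53.687
```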
Pull Request resolved: https://github.com/pytorch/benchmark/pull/2381
Reviewed By: manman-ren
Differential Revision: D59871536
Pulled By: xuzhao9
fbshipit-source-id: 23bf32d18bda5004bf614504e40d2c33ad8966d3