flash_attention integration (#81434)
# Summary:
- I added a new submodule, Cutlass, pointing to the 2.10 release. Inclusion of the flash_attention code is gated by the USE_FLASH_ATTENTION flag, which defaults to off, so FlashAttention is not built anywhere. This is intentional since we don't have A100 machines to compile and test on. (See the gating sketch after this list.)
- I only looked at the CMake build and have not attempted Bazel or Buck yet.
- I included the mha_fwd kernel from flash_attention, which has been refactored to use Cutlass 2.10. There is currently no backward kernel on this branch; that would be a good follow-up.
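
Below is a minimal sketch of how compile-time gating on USE_FLASH_ATTENTION might look. Only the flag name, the forward-only limitation, and the Cutlass 2.10 refactor come from this PR; the function names and stub bodies are hypothetical placeholders, not the actual PyTorch code.

```cpp
// Minimal sketch of compile-time gating via USE_FLASH_ATTENTION.
// The flag name is from this PR; everything else is illustrative.
#include <iostream>
#include <stdexcept>

#ifdef USE_FLASH_ATTENTION
// Forward-only path: a stand-in for the mha_fwd kernel refactored on
// Cutlass 2.10. No backward kernel exists on this branch.
void mha_fwd_stub() {
  std::cout << "FlashAttention forward (Cutlass 2.10) would run here\n";
}
#endif

void attention_forward() {
#ifdef USE_FLASH_ATTENTION
  mha_fwd_stub();  // compiled in only when the flag is set at build time
#else
  // Default build: USE_FLASH_ATTENTION is off, so the FlashAttention code
  // is never compiled and callers fall back to this error path.
  throw std::runtime_error("built without USE_FLASH_ATTENTION");
#endif
}

int main() {
  try {
    attention_forward();
  } catch (const std::exception& e) {
    std::cerr << e.what() << "\n";
  }
  return 0;
}
```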
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81434
Approved by: https://github.com/cpuhrsch