DeepSpeed
adds triton flash attention2 kernel
#4337
Merged

stephen-youn merged 14 commits into master from styoun/triton-flash2
styoun initial commit
4318d771
styoun temp commit: needs debugging
95456d0e
styoun packed flash attn with mask works
2466fd9d
styoun clean-up
7e2e858e
styoun add bert/roberta tests to test_inference
b0d42400
styoun is_triton_supported added to Accelerator class
b6c47f6b
stephen-youn marked this pull request as ready for review 2 years ago
stephen-youn requested a review from RezaYazdaniAminabadi 2 years ago
stephen-youn requested a review from jeffra 2 years ago
stephen-youn requested a review from mrwyattii 2 years ago
stephen-youn requested a review from awan-10 2 years ago
stephen-youn requested a review from cmikeh2 2 years ago
stephen-youn requested a review from arashb 2 years ago
stephen-youn requested a review from tjruwase 2 years ago
styoun triton supports the flash attention when compute cap > 8.0
1ec42552
styoun formatting
26853e9d
styoun fix comments
f30f64ca
styoun cleanup
401b3d5a
lekurile requested changes on 2023-09-14
styoun cleanup flash kernel
fae2ab9a
styoun Merge branch 'master' into styoun/triton-flash2
4bae6079
styoun fix according to the PR comment
4e6c1646
lekurile approved these changes on 2023-09-20
lekurile Merge branch 'master' into styoun/triton-flash2
544f1ddc
stephen-youn enabled auto-merge 2 years ago
stephen-youn merged 0e0748c5 into master 2 years ago
