onnxruntime
PR #14215 (Merged): Support flash attention on 2d attention mask for gpt2 left padding.

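The PR is about GPT-2 batches that are left-padded: each row carries a 2D attention mask of shape (batch_size, sequence_length) whose zeros mark the padded key positions at the front of the row, and the fused/flash attention path is taught to accept that mask. A minimal NumPy sketch of such a mask and the additive-bias form attention kernels commonly consume; this is illustrative only and assumes nothing about onnxruntime's internal kernel code:

```python
import numpy as np

batch_size, seq_len = 2, 5

# 1 = real token, 0 = padding. Left padding places the zeros first.
attention_mask = np.array([
    [0, 0, 1, 1, 1],   # length-3 sequence, left-padded by 2
    [0, 1, 1, 1, 1],   # length-4 sequence, left-padded by 1
], dtype=np.int32)

# Attention kernels commonly turn this into an additive bias over keys:
# 0 where attention is allowed, a large negative number where it is masked.
additive_bias = (1.0 - attention_mask.astype(np.float32)) * -10000.0
additive_bias = additive_bias[:, None, None, :]  # (batch, 1, 1, key_len)
print(additive_bias.shape)  # (2, 1, 1, 5)
```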
zhanghuanrong committed f3f554de: Support flash attention on 2d attention mask for gpt2 left padding.
tianleiwu left two review comments on 2023-01-10.
zhanghuanrong committed f5f3df14: Support both left padding and right padding for attention mask
zhanghuanrong committed 39c4e602: use AttentionMaskType::MASK_2D_KEY_PADDING
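MASK_2D_KEY_PADDING is a value of onnxruntime's AttentionMaskType enum. The sketch below illustrates, in simplified Python, how a raw mask's shape could map to a mask type; it borrows the enum names but is not the library's actual C++ dispatch logic:

```python
# Illustrative only: enum names follow onnxruntime's AttentionMaskType,
# but this classification is a simplified sketch, not the real implementation.
def classify_mask(mask_shape, batch_size, total_seq_len):
    if mask_shape == (batch_size,):
        # One key length per batch row: can only express right padding.
        return "MASK_1D_KEY_SEQ_LEN"
    if mask_shape == (batch_size, total_seq_len):
        # Per-position 0/1 mask: can express left padding, which is what
        # this PR enables on the fused/flash attention path.
        return "MASK_2D_KEY_PADDING"
    raise ValueError(f"unsupported mask shape {mask_shape}")

assert classify_mask((2, 5), 2, 5) == "MASK_2D_KEY_PADDING"
```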
tianleiwu left four review comments on 2023-01-10.
zhanghuanrong committed 8e4c5260: fix typo
zhanghuanrong committed 4c64c910: save reduce. pass build.
tianleiwu commented on 2023-01-12.
zhanghuanrong committed 4432b15c: update the support after offline testing.
tianleiwu approved these changes on 2023-01-17.
yufenglee added the release:1.14 label.
zhanghuanrong merged a8df6c35 into main 2 years ago.
zhanghuanrong deleted the zhalei/enable2dmaskForFusedMultiHeadAttention branch 2 years ago.
faxu removed the release:1.14 label.
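With the change merged, a left-padded batch can be fed through the normal onnxruntime Python API. A hedged usage sketch: the model file and the input names (input_ids, attention_mask, position_ids) are assumptions based on a typical GPT-2 ONNX export, not something stated in this PR, and the mask dtype may differ per export:

```python
import numpy as np
import onnxruntime as ort

# Hypothetical model file exported with fused/flash attention enabled.
sess = ort.InferenceSession("gpt2_fused_attention.onnx",
                            providers=["CUDAExecutionProvider"])

pad_id = 50256  # GPT-2 eos token doubling as pad, a common convention
input_ids = np.array([
    [pad_id, pad_id, 15496, 11, 995],   # left-padded prompt
    [pad_id, 40, 1101, 257, 2746],
], dtype=np.int64)
attention_mask = (input_ids != pad_id).astype(np.int64)

# With left padding, GPT-2 position ids must restart at 0 on the first
# real token of each row, not at column 0.
position_ids = np.maximum(np.cumsum(attention_mask, axis=1) - 1, 0)

outputs = sess.run(None, {
    "input_ids": input_ids,
    "attention_mask": attention_mask,
    "position_ids": position_ids,
})
```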
