Support flash attention on 2d attention mask for gpt2 left padding. #14215
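For context, the 2D mask referred to in the title is the usual (batch_size, sequence_length) key-padding mask that batched GPT-2 inference produces when sequences are padded on the left. Below is a minimal sketch of such a mask, built with the Hugging Face GPT2Tokenizer purely for illustration; the tokenizer usage is an assumption for demonstration only and is not part of the kernel changes in this PR.

```python
# Illustration only (not part of this PR): build the 2D key-padding
# attention mask of shape (batch_size, sequence_length) that left-padded
# GPT-2 batches produce. This is the mask layout (MASK_2D_KEY_PADDING)
# that the fused/flash attention path is extended to accept.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # GPT-2 has no pad token by default
tokenizer.padding_side = "left"             # pad on the left, as in batched generation

batch = ["Hello world", "A noticeably longer prompt that needs no padding"]
encoded = tokenizer(batch, padding=True, return_tensors="np")

# 0 marks left padding, 1 marks real tokens, one row per sequence, e.g.:
# [[0 0 0 0 0 0 1 1]
#  [1 1 1 1 1 1 1 1]]
print(encoded["attention_mask"])
```

With right padding, which this change also supports, the zeros would simply appear at the end of each row instead.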
Commits:
f3f554de  Support flash attention on 2d attention mask for gpt2 left padding.
f5f3df14  Support both left padding and right padding for attention mask
39c4e602  use AttentionMaskType::MASK_2D_KEY_PADDING
8e4c5260  fix typo
4c64c910  save reduce; pass build
4432b15c  update the support after offline testing
tianleiwu approved these changes on 2023-01-17
zhanghuanrong deleted the zhalei/enable2dmaskForFusedMultiHeadAttention branch 2 years ago
faxu removed the release:1.14 label
Assignees: No one assigned