[Qwen] avoid creating attention masks when there is no padding #12987
avoid creating attention masks when there is no padding
661febb6
make fix-copies
5507b5e8
Merge branch 'main' into fix-reg
cd85aae7
torch compile tests
4839fcfc
set all ones mask to none
23150e46
fix positional encoding from becoming > 4096
47f6585e
fix from review
da6e128e
slice freqs_cis to match the input sequence length
283df923
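The "slice freqs_cis" commit points at a common rotary-embedding pattern: the frequency table is precomputed up to a maximum length (the 4096 mentioned in an earlier commit), and slicing it to the current sequence length keeps position indices in range. A hedged sketch with an assumed layout of `(max_positions, dim)`; `slice_freqs_cis` is a hypothetical name:

```python
import torch

def slice_freqs_cis(freqs_cis: torch.Tensor, seq_len: int) -> torch.Tensor:
    # freqs_cis is precomputed for a maximum length, e.g. (4096, dim).
    # Slicing along the position axis matches it to the actual input,
    # so rotary positions never exceed the cached table.
    return freqs_cis[:seq_len]
```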
Merge branch 'main' into fix-reg
19d9d092
keep only attention masking change
3a0fd2db
Merge branch 'main' into fix-reg
0f70ef5a
yiyixuxu approved these changes on 2026-01-27
yiyixuxu merged commit d54669a7 into main 42 days ago
Assignees: no one assigned