diffusers
d54669a7 - [Qwen] avoid creating attention masks when there is no padding (#12987)

[Qwen] avoid creating attention masks when there is no padding (#12987)

* avoid creating attention masks when there is no padding
* make fix-copies
* torch compile tests
* set all ones mask to none
* fix positional encoding from becoming > 4096
* fix from review
* slice freqs_cis to match the input sequence length
* keep only attention masking change

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
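The "set all ones mask to none" step can be sketched as a small helper: when an attention mask contains no zeros (i.e. the batch has no padding), passing `None` lets the attention kernel skip the masked code path entirely. This is a minimal illustration, not the diffusers implementation; the helper name `maybe_drop_attention_mask` is hypothetical.

```python
import torch


def maybe_drop_attention_mask(attention_mask):
    """Return None when the mask is all ones (no padding present).

    Hypothetical helper illustrating the optimization: attention
    backends can take a faster unmasked path when mask is None.
    """
    if attention_mask is not None and torch.all(attention_mask.bool()):
        return None
    return attention_mask
```

A mask with any zero entry (real padding) is returned unchanged, so behavior is identical for padded batches.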