transformers
0327d0f7
- [performance_optim] define flash attention mask on NPU device directly (#37698)
Commit
334 days ago
[performance_optim] define flash attention mask on NPU device directly (#37698)
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
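The optimization named in the commit message follows a general PyTorch pattern: build the mask tensor directly on the target accelerator instead of constructing it on the CPU and copying it over, which avoids a host-to-device transfer on every call. Below is a minimal sketch of that pattern, not the actual transformers change; the function name and the NPU-availability check are illustrative assumptions.

```python
import torch


def make_causal_mask(seq_len: int, device: torch.device) -> torch.Tensor:
    # Hypothetical helper illustrating the pattern; not the transformers code.
    # Slower pattern this commit's description argues against:
    #   mask = torch.tril(torch.ones(seq_len, seq_len)).to(device)
    # builds the mask on CPU, then pays for a host->device copy.
    # Faster: allocate directly on the target device.
    return torch.tril(torch.ones(seq_len, seq_len, device=device))


# Pick the NPU when available (torch_npu registers a `torch.npu` namespace),
# otherwise fall back to CPU so the sketch runs anywhere.
device = torch.device(
    "npu" if hasattr(torch, "npu") and torch.npu.is_available() else "cpu"
)
mask = make_causal_mask(4, device)
```

The resulting lower-triangular mask (1 where a position may attend, 0 above the diagonal) never leaves the accelerator's memory.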
References
#37698 - [performance_optim] define flash attention mask on NPU device directly
Author
FightingZhen
Parents
14e28bd7