transformers
3c289e21
- [performance_optim] reduce frequency of declaring attention_mask in Ascend NPU flash attention (#38278)
Committed 304 days ago
References: #38278 - [performance_optim] reduce frequency of declaring attention_mask in Ascend NPU flash attention
Author: FightingZhen
Parents: f5d45d89
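The optimization named in the commit message, avoiding re-declaring the attention mask on every flash-attention call, amounts to caching the mask and rebuilding it only when a new shape is requested. The actual transformers/NPU implementation is not shown here; the following is a minimal sketch of the caching idea using a hypothetical `get_causal_mask` helper, kept framework-free (plain Python instead of tensors) for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=8)
def get_causal_mask(seq_len: int):
    """Build a causal mask once per sequence length and reuse it.

    True marks positions that must be masked out (future tokens).
    Repeated calls with the same seq_len return the cached object,
    so the mask is not re-declared on every attention call.
    """
    return tuple(
        tuple(col > row for col in range(seq_len))
        for row in range(seq_len)
    )
```

Because `lru_cache` keys on `seq_len`, a training loop that calls `get_causal_mask(seq_len)` each step allocates the mask only once per distinct length, which is the same reuse pattern the commit applies to the Ascend NPU flash-attention mask.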