onnxruntime
7d9b12a2
- [CPU] SparseAttention op (#21110)
Commit
1 year ago
[CPU] SparseAttention op (#21110)

Add SparseAttention CPU implementation.

- [x] Refactoring GQAAttentionBase
- [x] Add SparseAttention implementation
- [x] Add test cases

This is the unfused version. A flash attention version will be added later.
References
#21110 - [CPU] SparseAttention op
Author
tianleiwu
Parents
30b6e82e