onnxruntime
7d9b12a2 - [CPU] SparseAttention op (#21110)

Commit
1 year ago
Add a CPU implementation of SparseAttention.

- [x] Refactor GQAAttentionBase
- [x] Add SparseAttention implementation
- [x] Add test cases

This is the unfused version. A flash attention version will be added later.
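For readers unfamiliar with the term, here is a minimal sketch of what an "unfused" block-sparse attention computation looks like: scores, softmax, and the weighted sum over values are computed as separate steps, with the scores for each query row held in memory. It is an illustration only, not the code from this commit. The function name `unfused_sparse_attention`, the dense `block_mask` layout, the single-head shape, and the causal restriction are all assumptions made for the example.

```python
import math

def unfused_sparse_attention(q, k, v, block_mask, block_size):
    """Hypothetical single-head sketch (not the onnxruntime kernel).

    q, k, v: lists of rows, each of shape (seq_len, head_dim).
    block_mask[i][j] is truthy when query block i may attend to key block j
    (assumed dense layout, chosen here for readability).
    """
    head_dim = len(q[0])
    scale = 1.0 / math.sqrt(head_dim)
    out = []
    for i in range(len(q)):
        # Step 1: scaled dot-product scores for allowed, causal positions only.
        scores = {}
        for j in range(i + 1):
            if block_mask[i // block_size][j // block_size]:
                scores[j] = scale * sum(a * b for a, b in zip(q[i], k[j]))
        if not scores:
            # No visible keys for this row: emit zeros (an assumption).
            out.append([0.0] * len(v[0]))
            continue
        # Step 2: numerically stable softmax over the materialized scores.
        peak = max(scores.values())
        weights = {j: math.exp(s - peak) for j, s in scores.items()}
        total = sum(weights.values())
        # Step 3: weighted sum of the value rows.
        out.append([
            sum(w * v[j][d] for j, w in weights.items()) / total
            for d in range(len(v[0]))
        ])
    return out
```

A fused (flash attention style) kernel would instead process keys in tiles and keep a running softmax, so the full row of scores is never stored.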