onnxruntime
0dc8613d
- Merge branch 'flash_v2_packed_mha' into flash_v2_no_cuda52
Commit
2 years ago
Merge branch 'flash_v2_packed_mha' into flash_v2_no_cuda52
Nuget Fix
References
#17674 - [CUDA] GroupQueryAttention operator using FlashAttention
Author
aciddelgado
Parents
e7b7f2e9
f7601235