llama.cpp
Commit 622cd010
2 days ago
ggml: CUDA: add head size 72 for flash-attn (#16962)
References
#16962 - CUDA: add head size 72 for flash-attn
Author
theo77186
Parents
070ff4d5
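
The change adds head size 72 to the set of head sizes the CUDA flash-attention kernels accept. As a rough illustration only (the names and structure below are hypothetical, not the actual ggml source), supporting a new head size typically means adding a case to a host-side dispatch that selects a kernel compiled for that size:

// Hypothetical sketch, not the actual ggml code: it illustrates the kind of
// change the commit describes, extending a head-size dispatch for CUDA
// flash-attention kernels so that D = 72 is accepted instead of rejected.
#include <cstdio>

// Stand-in for a flash-attention kernel launcher templated on head size D.
// Kernels are typically compiled per head size for performance.
template <int D>
static void launch_flash_attn(/* tensor arguments elided */) {
    std::printf("launching flash-attn kernel for head size %d\n", D);
}

// Host-side dispatch over the supported head sizes. Adding a new head size
// amounts to adding a case here plus instantiating the matching kernel.
static bool flash_attn_dispatch(int head_size) {
    switch (head_size) {
        case 64:  launch_flash_attn<64>();  return true;
        case 72:  launch_flash_attn<72>();  return true; // newly supported size
        case 80:  launch_flash_attn<80>();  return true;
        case 128: launch_flash_attn<128>(); return true;
        default:  return false; // unsupported: caller falls back to a non-FA path
    }
}

int main() {
    if (!flash_attn_dispatch(72)) {
        std::printf("head size 72 not supported\n");
    }
    return 0;
}

Since the kernels are compiled per head size, a previously unlisted size such as 72 would fail the dispatch and force a slower fallback path; listing it lets models whose attention heads have that dimension use flash attention directly.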