llama.cpp
a4837577 - vulkan: use aligned loads for flash attention mask (#12853)
Commit
249 days ago
vulkan: use aligned loads for flash attention mask (#12853)
Rewrite the stride logic for the mask tensor in the FA shader to force the stride to be aligned, allowing more efficient loads.
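As a rough illustration of the idea (a sketch only, not the shader code from this commit): rounding the mask row stride up to a multiple of the vector-load width guarantees that every row starts on an aligned boundary, so the shader can use aligned, vectorized loads. The names and the 8-element (16-byte fp16) width below are assumptions for the sketch, not values taken from llama.cpp.

#include <cstddef>

// Illustrative sketch: round a row stride (in elements) up to a multiple
// of an assumed vector-load width so each mask row begins on an aligned
// boundary. The 8-element width (16 bytes of fp16) is an example value,
// not taken from the llama.cpp FA shader.
constexpr size_t kVecElems = 8;

constexpr size_t align_stride(size_t n_elems) {
    return (n_elems + kVecElems - 1) / kVecElems * kVecElems;
}

int main() {
    // A mask row of 100 fp16 values is padded to a stride of 104 elements,
    // so consecutive rows all start 16-byte aligned.
    static_assert(align_stride(100) == 104, "stride rounded up to vector width");
    return 0;
}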
References
#12853 - vulkan: use aligned loads for flash attention mask
Author
jeffbolznv
Parents
e59ea539