llama.cpp
449ec2ab - vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)

Commit
3 days ago
vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)

Write out a 2-bit code per block and avoid loading the mask when it matches these two common cases. Apply this optimization when the mask is relatively large (i.e. prompt processing).
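To make the idea in the commit message concrete, here is a minimal CPU-side sketch in C++ of a mask preprocessing pass. It is not the Vulkan shader from this commit. The code values, the packing of 16 codes per 32-bit word, the row-major float mask, and the names `MaskBlockCode`, `classify_block`, `preprocess_mask`, and `get_code` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-bit codes; the encoding used by the actual commit may differ.
enum MaskBlockCode : uint8_t {
    MASK_MIXED       = 0, // block has other values, the mask must be loaded
    MASK_ALL_NEG_INF = 1, // every entry is -inf, the block contributes nothing
    MASK_ALL_ZERO    = 2, // every entry is 0, the mask add can be skipped
};

// Classify one br x bc block of a row-major mask with `cols` columns,
// starting at row r0 and column c0.
MaskBlockCode classify_block(const std::vector<float>& mask, size_t cols,
                             size_t r0, size_t c0, size_t br, size_t bc) {
    bool all_neg_inf = true;
    bool all_zero    = true;
    for (size_t r = r0; r < r0 + br; ++r) {
        for (size_t c = c0; c < c0 + bc; ++c) {
            const float v = mask[r * cols + c];
            all_neg_inf = all_neg_inf && std::isinf(v) && v < 0.0f;
            all_zero    = all_zero && v == 0.0f;
        }
    }
    if (all_neg_inf) return MASK_ALL_NEG_INF;
    if (all_zero)    return MASK_ALL_ZERO;
    return MASK_MIXED;
}

// Produce one 2-bit code per block, packed 16 codes per uint32_t.
// Assumes rows and cols are exact multiples of the block size.
std::vector<uint32_t> preprocess_mask(const std::vector<float>& mask,
                                      size_t rows, size_t cols,
                                      size_t br, size_t bc) {
    assert(br > 0 && bc > 0 && rows % br == 0 && cols % bc == 0);
    assert(mask.size() == rows * cols);
    const size_t nbr = rows / br;
    const size_t nbc = cols / bc;
    std::vector<uint32_t> packed((nbr * nbc + 15) / 16, 0u);
    for (size_t i = 0; i < nbr; ++i) {
        for (size_t j = 0; j < nbc; ++j) {
            const size_t idx = i * nbc + j;
            const uint32_t code = classify_block(mask, cols, i * br, j * bc, br, bc);
            packed[idx / 16] |= code << (2 * (idx % 16));
        }
    }
    return packed;
}

// Read back the code for block index idx (row-major over blocks).
MaskBlockCode get_code(const std::vector<uint32_t>& packed, size_t idx) {
    return static_cast<MaskBlockCode>((packed[idx / 16] >> (2 * (idx % 16))) & 3u);
}
```

In a flash attention loop, a consumer of these codes would presumably skip the whole block when the code is `MASK_ALL_NEG_INF`, skip only the mask load and add when it is `MASK_ALL_ZERO`, and fall back to loading the mask otherwise. For a causal mask, most blocks away from the diagonal fall into one of the two cheap cases, which is consistent with the commit applying the pass only to relatively large masks.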