llama.cpp
CUDA: use mma PTX instructions for FlashAttention
#11583
Merged
