llama.cpp
CUDA: use mma FA kernel for gqa > 4 on RTX 4000
#15035
Merged

Commits
  • CUDA: use mma FA kernel for gqa > 4 on RTX 4000
    JohannesGaessler committed 219 days ago