llama.cpp
622cd010 - ggml: CUDA: add head size 72 for flash-attn (#16962)
