CUDA: add head size 72 for flash-attn #16962
CUDA: add head size 72
72545ce2
ngxson approved these changes on 2025-11-03
ngxson merged 622cd010 into master 212 days ago
theo77186 deleted the fattn-hs72 branch 212 days ago
Assignees
No one assigned
Labels
Nvidia GPU
python
ggml