PR #16962 CUDA: add head size 72 for flash-attn

ngxson merged 1 commit into ggml-org:master from theo77186:fattn-hs72

CUDA: add head size 72

72545ce2

theo77186 requested a review from JohannesGaessler 247 days ago

JohannesGaessler approved these changes on 2025-11-03

ngxson approved these changes on 2025-11-03

github-actions added Nvidia GPU

github-actions added python

github-actions added ggml

ngxson merged 622cd010 into master 247 days ago

theo77186 deleted the fattn-hs72 branch 247 days ago

Reviewers

ngxson

JohannesGaessler

Labels

Nvidia GPU python ggml
