CUDA: attention sinks for mma FlashAttention #15157
e95d0430
slaren approved these changes on 2025-08-07
ggerganov approved these changes on 2025-08-08
am17an commented on 2025-08-08