llama.cpp
34c9d765 - CUDA: add attention sinks for tile and wmma (#15178)

Commit
29 days ago
CUDA: add attention sinks for tile and wmma (#15178)

* CUDA: add attention sinks for tile and wmma
* Review: formatting changes + remove syncthreads from tile + remove warp_reduce_max from wmma