llama.cpp
CUDA: batch out_prod inner loop with cublasSgemmStridedBatched
#22651
Merged
