llama.cpp
0fc1e820 - CUDA: faster large batch FA without tensor cores (#7314)

Commit
1 year ago