llama.cpp
4b6fb652 - context : round n_tokens to next multiple of n_seqs when reserving

context : round n_tokens to next multiple of n_seqs when reserving

This fixes RWKV inference, which fails when ubatch.n_seq_tokens is 0.
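A minimal sketch of the arithmetic involved (the helper name and surrounding code here are illustrative, not the actual llama.cpp reservation code): rounding n_tokens up to the next multiple of n_seqs guarantees that the per-sequence token count n_tokens / n_seqs is at least 1 instead of 0.

```cpp
// Sketch only: round the reserved token count up to the next multiple of the
// sequence count, so each sequence in the reserved ubatch gets >= 1 token.
#include <cstdint>
#include <cstdio>

// Hypothetical helper: round n up to the next multiple of m (m > 0).
static uint32_t round_up_to_multiple(uint32_t n, uint32_t m) {
    return ((n + m - 1) / m) * m;
}

int main() {
    // Example: with fewer tokens than sequences, the unrounded division
    // would yield n_seq_tokens = 3 / 4 = 0, the failure mode described above.
    uint32_t n_tokens = 3;
    uint32_t n_seqs   = 4;

    uint32_t n_tokens_rounded = round_up_to_multiple(n_tokens, n_seqs); // 4
    uint32_t n_seq_tokens     = n_tokens_rounded / n_seqs;              // 1, no longer 0

    printf("n_tokens: %u -> %u, n_seq_tokens: %u\n",
           n_tokens, n_tokens_rounded, n_seq_tokens);
    return 0;
}
```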