llama.cpp
a20b2b05 - context : round n_tokens to next multiple of n_seqs when reserving (#14140)

Committed 87 days ago.

This fixes RWKV inference, which otherwise failed when the worst-case ubatch.n_seq_tokens rounded to 0.