llama.cpp
a20b2b05 - context : round n_tokens to next multiple of n_seqs when reserving (#14140)

Committed 87 days ago.

This fixes RWKV inference, which otherwise failed when the worst-case ubatch.n_seq_tokens rounded to 0.