llama.cpp
a20b2b05
- context : round n_tokens to next multiple of n_seqs when reserving (#14140)
Commit · 87 days ago
context : round n_tokens to next multiple of n_seqs when reserving (#14140)

This fixes RWKV inference, which otherwise failed when the worst-case ubatch.n_seq_tokens rounded to 0.
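To illustrate the rounding (a minimal sketch, not the actual llama.cpp code; round_up_to_multiple is a hypothetical helper): the per-sequence token count is derived as n_tokens / n_seqs, and when n_tokens < n_seqs that integer division truncates to 0. Rounding n_tokens up to the next multiple of n_seqs keeps it at least 1.

    #include <cstdint>
    #include <cassert>

    // Hypothetical helper mirroring the fix: round n_tokens up to the
    // next multiple of n_seqs so n_tokens / n_seqs never truncates to 0.
    static uint32_t round_up_to_multiple(uint32_t n_tokens, uint32_t n_seqs) {
        assert(n_seqs > 0);
        return ((n_tokens + n_seqs - 1) / n_seqs) * n_seqs;
    }

    int main() {
        // Before the fix: 3 / 4 == 0, so the worst-case n_seq_tokens
        // became 0 during reservation, breaking RWKV inference.
        uint32_t n_tokens = 3, n_seqs = 4;
        uint32_t rounded      = round_up_to_multiple(n_tokens, n_seqs); // 4
        uint32_t n_seq_tokens = rounded / n_seqs;                       // 1
        assert(n_seq_tokens >= 1);
        return 0;
    }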
References
#14140 - context : round n_tokens to next multiple of n_seqs when reserving
Author
compilade
Parents
2e89f76b