llama.cpp
bc219750 - speculative : fix handling of some input params (#9963)

speculative : fix handling of some input params (#9963)

* speculative : fix batch sizes at initialization

  ggml-ci

* speculative : handle params.n_predict == -1

* speculative : limit batch size to llama_n_batch
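For context, the bullets describe sanitizing the input parameters of the speculative-decoding example: batch sizes must not exceed what the context was created with, and `n_predict == -1` means "no explicit limit". Below is a minimal, hypothetical sketch of that kind of clamping; the `spec_params` struct and `sanitize()` helper are stand-ins invented for illustration, not the commit's actual diff or llama.cpp's API.

```cpp
// Hypothetical sketch (not the actual commit diff) of sanitizing
// speculative-decoding input params along the lines described above.
#include <algorithm>
#include <climits>
#include <cstdio>

struct spec_params {          // stand-in for llama.cpp's common params
    int n_predict = -1;       // -1 means "no limit" (generate until EOS)
    int n_draft   = 16;       // tokens drafted per speculation step
    int n_batch   = 8;        // batch size the llama_context was created with
};

// clamp the params so that batch submissions never exceed n_batch and a
// negative n_predict is treated as "effectively unlimited"
static void sanitize(spec_params & p) {
    if (p.n_predict < 0) {
        p.n_predict = INT_MAX;                  // handle n_predict == -1
    }
    p.n_draft = std::min(p.n_draft, p.n_batch); // limit batch size to n_batch
}

int main() {
    spec_params p;
    p.n_draft = 64;            // user asked for more drafted tokens than fit
    sanitize(p);
    std::printf("n_draft = %d, n_predict = %d\n", p.n_draft, p.n_predict);
    return 0;
}
```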