llama.cpp
bc219750 - speculative : fix handling of some input params (#9963)

speculative : fix handling of some input params (#9963)

* speculative : fix batch sizes at initialization

  ggml-ci

* speculative : handle params.n_predict == -1

* speculative : limit batch size to llama_n_batch
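For context, the bullets describe sanitizing the input parameters of the speculative-decoding example: batch sizes must not exceed what the context was created with, and `n_predict == -1` means "no explicit limit". Below is a minimal, hypothetical sketch of that kind of clamping; the `spec_params` struct and `sanitize()` helper are stand-ins invented for illustration, not the commit's actual diff or llama.cpp's API.

```cpp
// Hypothetical sketch (not the actual commit diff) of sanitizing
// speculative-decoding input params along the lines described above.
#include <algorithm>
#include <climits>
#include <cstdio>

struct spec_params {          // stand-in for llama.cpp's common params
    int n_predict = -1;       // -1 means "no limit" (generate until EOS)
    int n_draft   = 16;       // tokens drafted per speculation step
    int n_batch   = 8;        // batch size the llama_context was created with
};

// clamp the params so that batch submissions never exceed n_batch and a
// negative n_predict is treated as "effectively unlimited"
static void sanitize(spec_params & p) {
    if (p.n_predict < 0) {
        p.n_predict = INT_MAX;                  // handle n_predict == -1
    }
    p.n_draft = std::min(p.n_draft, p.n_batch); // limit batch size to n_batch
}

int main() {
    spec_params p;
    p.n_draft = 64;            // user asked for more drafted tokens than fit
    sanitize(p);
    std::printf("n_draft = %d, n_predict = %d\n", p.n_draft, p.n_predict);
    return 0;
}
```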