llama.cpp
d62b532c - Use model->gguf_kv for loading the template instead of using the C API. (#10868)

Commit · 268 days ago
Use model->gguf_kv for loading the template instead of using the C API. (#10868)

* Bump model_template to 16384 bytes to support larger chat templates.
* Use `model->gguf_kv` for efficiency.