llama.cpp
c32fa21d - sampling: reuse token data buffer in llama_sampler_sample (#18365)

Commit
1 day ago
sampling: reuse token data buffer in llama_sampler_sample (#18365)

* sampling: reuse token data buffer in llama_sampler_sample
* move cur buffer before timing section, after samplers
* minor : fix build

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
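
For readers unfamiliar with the pattern, here is a minimal sketch of what reusing a candidate buffer across sampling calls can look like. It is not the code from this commit: `token_data`, `sampler_state`, and `sample_greedy` are hypothetical stand-ins, and the greedy argmax replaces whatever sampler chain the real function runs. The only point illustrated is that a buffer owned by long-lived state avoids a fresh allocation on every call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a per-token candidate record (id, logit, probability).
struct token_data {
    int32_t id;
    float   logit;
    float   p;
};

// Hypothetical sampler state. The candidate buffer lives here, so it survives
// between calls instead of being a local vector rebuilt on every sample.
struct sampler_state {
    std::vector<token_data> cur;
};

// Fill the reusable buffer from the logits, then pick the highest-logit token.
// Assumption: greedy selection is used only to keep the sketch short.
int32_t sample_greedy(sampler_state & st, const std::vector<float> & logits) {
    assert(!logits.empty());

    // resize() allocates only if the current capacity is too small; with a
    // fixed vocabulary size, every call after the first reuses the storage.
    st.cur.resize(logits.size());
    for (size_t i = 0; i < logits.size(); ++i) {
        st.cur[i] = token_data{ static_cast<int32_t>(i), logits[i], 0.0f };
    }

    size_t best = 0;
    for (size_t i = 1; i < st.cur.size(); ++i) {
        if (st.cur[i].logit > st.cur[best].logit) {
            best = i;
        }
    }
    return st.cur[best].id;
}
```

Calling `sample_greedy` repeatedly with the same `sampler_state` and same-sized logits keeps `st.cur.data()` stable, which is the allocation saving the commit title refers to.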