llama.cpp
eef375ce - sampling : remove sampling branching in output_reserve (#18811)

Commit
14 days ago
sampling : remove sampling branching in output_reserve (#18811)

* sampling : remove sampling branching in output_reserve

This commit updates output_reserve in llama-context.cpp to always allocate sampling buffers regardless of whether sampling is needed for the current batch.

The motivation for this is to avoid reallocations and branching based on the sampling requirements of the batch.
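
The idea described in the commit message can be illustrated with a minimal sketch. This is not the code from llama-context.cpp: the struct `output_buffers`, its members, and the buffer layout are all hypothetical and exist only to show the pattern of sizing sampling buffers unconditionally, so the reserve path has no per-batch "is sampling needed?" branch and does not reallocate when that requirement changes between batches.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a context's output buffers (not the llama.cpp types).
struct output_buffers {
    std::size_t n_vocab;             // vocabulary size, fixed at construction
    std::size_t n_outputs_max = 0;   // largest number of output rows reserved so far
    std::size_t n_reallocs    = 0;   // how many times the buffers had to grow

    std::vector<float> logits;           // n_outputs_max * n_vocab
    std::vector<float> sampling_probs;   // n_outputs_max * n_vocab (assumed layout)
    std::vector<int>   sampling_tokens;  // n_outputs_max

    explicit output_buffers(std::size_t vocab) : n_vocab(vocab) {}

    // Reserve room for n_outputs rows. The sampling buffers are always sized
    // together with the logits, whether or not the current batch samples.
    void reserve(std::size_t n_outputs) {
        if (n_outputs <= n_outputs_max) {
            return; // existing allocation is large enough, nothing to do
        }
        logits.resize(n_outputs * n_vocab);
        sampling_probs.resize(n_outputs * n_vocab);
        sampling_tokens.resize(n_outputs);
        n_outputs_max = n_outputs;
        ++n_reallocs;
    }
};
```

In this sketch the cost is some extra memory for batches that never sample; in exchange, the only condition that triggers a reallocation is a larger output count.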