llama.cpp
eef375ce - sampling : remove sampling branching in output_reserve (#18811)

Commit

14 days ago

sampling : remove sampling branching in output_reserve (#18811)

This commit updates output_reserve in llama-context.cpp to always allocate the sampling buffers, regardless of whether sampling is needed for the current batch. The motivation is to avoid reallocations and branching that depend on the sampling requirements of each batch.
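To illustrate the trade-off the commit message describes, here is a minimal, self-contained sketch. It is not the code from llama-context.cpp: `OutputBuffers`, `ensure_size`, `reserve_branching`, `reserve_always`, and the buffer sizes are hypothetical names and values chosen only for this example. It contrasts reserving sampling storage conditionally per batch with reserving it unconditionally.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a context's output storage (not llama.cpp's real types).
struct OutputBuffers {
    std::vector<float> logits;     // assumed size: n_outputs * n_vocab
    std::vector<float> sampling;   // assumed extra storage used only when sampling
    std::size_t n_reallocs = 0;    // number of times any buffer had to grow
};

// Grow `buf` to at least `n` elements and count the growth as a reallocation.
void ensure_size(std::vector<float> & buf, std::size_t n, std::size_t & n_reallocs) {
    if (buf.size() < n) {
        buf.resize(n);
        ++n_reallocs;
    }
}

// Branching variant: sampling storage is reserved only when the batch asks for it,
// so the first batch that needs sampling triggers a late reallocation.
void reserve_branching(OutputBuffers & out, std::size_t n_outputs, std::size_t n_vocab,
                       bool needs_sampling) {
    ensure_size(out.logits, n_outputs * n_vocab, out.n_reallocs);
    if (needs_sampling) {
        ensure_size(out.sampling, n_outputs * n_vocab, out.n_reallocs);
    }
}

// Unconditional variant: both buffers are always reserved, so later batches of the
// same size never grow anything, whatever their sampling requirements are.
void reserve_always(OutputBuffers & out, std::size_t n_outputs, std::size_t n_vocab) {
    ensure_size(out.logits,   n_outputs * n_vocab, out.n_reallocs);
    ensure_size(out.sampling, n_outputs * n_vocab, out.n_reallocs);
}
```

In this sketch the unconditional variant does all of its growth on the first call and has one code path, at the cost of holding the sampling storage even for batches that never sample.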

References

#18811 - sampling : remove sampling branching in output_reserve

Author

danbev

Parents

06961e28
