llama.cpp
ad8207af - cuda : enable CUDA graphs for MMID 1 <= BS <= 4 (#19645)

Commit
94 days ago
cuda : enable CUDA graphs for MMID 1 <= BS <= 4 (#19645)

* cuda : enable CUDA graphs for MMID BS <= 4

* cont : add stream capture check

Co-authored-by: Oliver Simons <osimons@nvidia.com>

* cont : add MMVQ_MMID_MAX_BATCH_SIZE

---------

Co-authored-by: Oliver Simons <osimons@nvidia.com>