llama.cpp
021cc28b - cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (#14741)

Commit

359 days ago

cuda : Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs (#14741)

* Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs

Gemma3n uses Matrix-Matrix addition as part of their input processing, wrongly triggering CUDA_GRAPH disablement on NVGPUs even when batch-size of 1 is used.

* Exclude `project_per_layer_input` by matching node names

This ensures that all other graphs which don't exhibit this pattern do not have their behavior changed.

* Revert unnecessary formatting changes
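
The check described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: the `Op` and `Node` types, the `src1_rows` field, and the `cuda_graph_compatible` function are hypothetical stand-ins, and the exact-match comparison against a single exempt node name is an assumption about how "matching node names" might look. The idea is that an addition whose second operand has more than one row is treated as a sign of batched input and rules out CUDA graph execution, unless the node carries the exempted name.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for graph nodes (not the ggml types).
enum class Op { Add, MulMat, Other };

struct Node {
    Op          op;         // operation performed by this node
    std::string name;       // node name used for the exemption match
    int64_t     src1_rows;  // rows of the second operand; > 1 means a matrix operand
};

// Returns true if, under this sketch's rule, the graph may run as a CUDA graph.
// A matrix-matrix Add disables CUDA graphs unless its name equals exempt_name.
bool cuda_graph_compatible(const std::vector<Node>& graph,
                           const std::string& exempt_name) {
    for (const Node& n : graph) {
        const bool matrix_add = (n.op == Op::Add) && (n.src1_rows > 1);
        const bool exempt     = (n.name == exempt_name);
        if (matrix_add && !exempt) {
            return false;  // treated as batched input: fall back to regular execution
        }
    }
    return true;
}
```

Matching on the node name keeps the exemption narrow: a graph containing any other matrix-matrix addition is still rejected, which reflects the stated goal of leaving the behavior of all other graphs unchanged.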

References

#14741 - Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs

Author

ORippler

Parents

d498af3d
