llama.cpp
0a423800 - CUDA: revert part of the RDNA1 optimizations (#8309)

Commit
342 days ago
CUDA: revert part of the RDNA1 optimizations (#8309)

The change on the launch_bounds was causing a small performance drop in perplexity of 25 t/s
  • ggml/src/ggml-cuda
    • mmq.cuh