llama.cpp
96892952 - mmq.cu: tune mmq/rocblas switching for RDNA (#18537)

Commit

17 days ago

mmq.cu: tune mmq/rocblas switching for RDNA (#18537)

* Patch perf regression for mmq kernels in ROCm

  recover performance regression for https://github.com/ggml-org/llama.cpp/issues/17917

* add n_experts branch like the cdna path

* mmq.cu: tune mmq/wmma switching for RDNA

* mmq.cu: move amd wmma mmq/wmma switching behind IS_RDNA3

* Update ggml/src/ggml-cuda/mmq.cu

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

---------

Co-authored-by: Jiacheng (Jason) Chen <76919340+jiachengjason@users.noreply.github.com>
Co-authored-by: jiachengjason <jasonchen.jiacheng@gmail.com>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

References

#18537 - mmq.cu: tune mmq/rocblas switching for RDNA

Author

Beinsezii

Parents

3d26a09d
