llama.cpp
CUDA/HIP: refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.
#12177
Merged


IMbackK merged 5 commits into ggml-org:master from IMbackK:refactor_mmqv
IMbackK requested a review from JohannesGaessler 197 days ago
github-actions added the Nvidia GPU label
github-actions added the ggml label
IMbackK refactor mmvq to unify the calculation of nwarps and rows per block …
888ffc87
IMbackK force-pushed from 50d4277c to 888ffc87 197 days ago
IMbackK make CUDA happy, as it doesn't support calling host constexpr function…
a55d765d
JohannesGaessler commented on 2025-03-06
IMbackK Fix nits
b85a723a
IMbackK Fix spelling of parameter
15f4dcaf
IMbackK requested a review from JohannesGaessler 194 days ago
JohannesGaessler approved these changes on 2025-03-07
IMbackK Update ggml/src/ggml-cuda/mmvq.cu
1b3894eb
IMbackK merged 10f2e818 into master 189 days ago
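
For readers unfamiliar with the pattern named in the title, the following is a minimal sketch of sharing one launch-parameter calculation between host and device code. It is not the code from this PR: the helper names (`example_nwarps`, `example_rows_per_block`, `launch_config`) and the values they return are hypothetical. The second commit message is truncated, so the exact change is not shown here; the sketch only assumes the general nvcc rule that device code cannot call a host-only `constexpr` function by default, which is why the helpers carry both `__host__` and `__device__`.

```cpp
// When built as plain C++ (no nvcc/hipcc), make the CUDA qualifiers no-ops
// so the sketch still compiles.
#if !defined(__CUDACC__) && !defined(__HIPCC__)
#define __host__
#define __device__
#endif

// Hypothetical helper: number of warps per block for a given batch width.
// Marked __host__ __device__ so the same definition is usable when computing
// launch dimensions on the host and inside a kernel.
static constexpr __host__ __device__ int example_nwarps(int ncols_y) {
    return ncols_y <= 4 ? 4 : 2; // illustrative values only
}

// Hypothetical helper: rows of the output handled by one block.
static constexpr __host__ __device__ int example_rows_per_block(int ncols_y) {
    return ncols_y == 1 ? 1 : 2; // illustrative values only
}

// Compile-time view of the same parameters, e.g. for template arguments.
template <int ncols_y>
struct launch_config {
    static constexpr int nwarps         = example_nwarps(ncols_y);
    static constexpr int rows_per_block = example_rows_per_block(ncols_y);
};

// Both views come from one definition, so they cannot drift apart.
static_assert(launch_config<1>::nwarps == example_nwarps(1), "single source of truth");
static_assert(launch_config<8>::rows_per_block == example_rows_per_block(8), "single source of truth");
```

The point of the design is that the host-side launch code and the kernel read the same function, so a change to the heuristic is made in one place.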
