llama.cpp
CUDA/HIP: refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.
#12177

Merged

IMbackK merged 5 commits into ggml-org:master from IMbackK:refactor_mmqv

IMbackK requested a review from JohannesGaessler 1 year ago

github-actions added Nvidia GPU

github-actions added ggml

refactor mmvq to unify the calculation of nwarps and rows per block …

888ffc87

IMbackK force pushed to 888ffc87 1 year ago

make CUDA happy, as it doesn't support calling host constexpr function…

a55d765d
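
For readers unfamiliar with the pattern this PR describes, the following is a minimal sketch, not the code from the PR. By default, CUDA device code cannot call a `constexpr` function that is host-only, so helpers shared by the launch code and the kernel are typically marked `__host__ __device__`. All names here (`HOST_DEVICE`, `calc_nwarps`, `calc_rows_per_block`, `make_launch_dims`, `kernel_config`) and all numeric thresholds are hypothetical and chosen for illustration only; they are not llama.cpp's tuning values.

```cpp
// Expands to the CUDA/HIP qualifiers under nvcc/hipcc and to nothing otherwise,
// so the same helpers can also be checked with a plain host compiler.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

constexpr int WARP_SIZE = 32; // assumption: 32-wide warps

// Hypothetical policy: fewer warps as the number of destination columns grows.
static constexpr HOST_DEVICE int calc_nwarps(int ncols_dst) {
    return ncols_dst <= 4 ? 4 : (ncols_dst <= 8 ? 2 : 1);
}

// Hypothetical policy: one row per block for a single column, two otherwise.
static constexpr HOST_DEVICE int calc_rows_per_block(int ncols_dst) {
    return ncols_dst == 1 ? 1 : 2;
}

// Host side: launch dimensions derived from the shared helpers.
struct launch_dims {
    int block_x; // threads per warp
    int block_y; // warps per block
    int grid_x;  // number of blocks, rounded up
};

static constexpr launch_dims make_launch_dims(int nrows, int ncols_dst) {
    return { WARP_SIZE,
             calc_nwarps(ncols_dst),
             (nrows + calc_rows_per_block(ncols_dst) - 1) / calc_rows_per_block(ncols_dst) };
}

// Device side: a kernel templated on ncols_dst would read the same values
// at compile time instead of duplicating the selection logic.
template <int ncols_dst>
struct kernel_config {
    static constexpr int nwarps         = calc_nwarps(ncols_dst);
    static constexpr int rows_per_block = calc_rows_per_block(ncols_dst);
};

// Host and device views agree by construction.
static_assert(kernel_config<1>::nwarps == make_launch_dims(1024, 1).block_y,
              "host launch config and kernel config must match");
```

Because both sides call the same function, a change to the policy cannot leave the kernel's compile-time constants out of sync with the grid and block sizes chosen at launch.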

JohannesGaessler commented on 2025-03-06

Fix nits

b85a723a

Fix spelling of parameter

15f4dcaf

IMbackK requested a review from JohannesGaessler 1 year ago

JohannesGaessler approved these changes on 2025-03-07

Update ggml/src/ggml-cuda/mmvq.cu

1b3894eb

IMbackK merged 10f2e818 into master 1 year ago

Reviewers

JohannesGaessler

Labels

Nvidia GPU, ggml