llama.cpp
CUDA/HIP: refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.
#12177

Merged

Commits

refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.

IMbackK committed 197 days ago
make CUDA happy, as it doesn't support calling host constexpr functions in device code, even though that should not be a problem.

IMbackK committed 197 days ago
Fix nits

IMbackK committed 194 days ago
Fix spelling of parameter

IMbackK committed 194 days ago
Update ggml/src/ggml-cuda/mmvq.cu

IMbackK committed 194 days ago
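
The first two commits describe sharing one calculation of nwarps and rows per block between the host-side launch code and the device-side kernel. They also note that CUDA rejects calls to host-only constexpr functions from device code. A minimal sketch of that pattern follows, assuming the helpers are marked `__host__ __device__` when built with nvcc or hipcc. Every name here (`SKETCH_HOST_DEVICE`, `sketch_calc_nwarps`, `sketch_calc_rows_per_block`, `sketch_launch_config`) and every threshold value is hypothetical and chosen for illustration only. None of it is taken from `ggml/src/ggml-cuda/mmvq.cu`.

```cpp
// Under nvcc/hipcc, mark the helpers as callable from both host and device.
// In a plain C++ build the macro expands to nothing, so the file still compiles.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Hypothetical rule: fewer warps per block as the number of dst columns grows.
// The thresholds are illustrative, not the values used by the PR.
static constexpr SKETCH_HOST_DEVICE int sketch_calc_nwarps(int ncols_dst) {
    return ncols_dst <= 4 ? 4 : (ncols_dst <= 8 ? 2 : 1);
}

// Hypothetical rule: one row per block for a single dst column, otherwise two.
static constexpr SKETCH_HOST_DEVICE int sketch_calc_rows_per_block(int ncols_dst) {
    return ncols_dst == 1 ? 1 : 2;
}

// Host-side launch dimensions derived from the same helpers that a kernel
// would call, so the two sides cannot drift apart.
struct sketch_launch_dims {
    int nblocks;
    int nwarps;
};

static sketch_launch_dims sketch_launch_config(int nrows, int ncols_dst) {
    const int rows_per_block = sketch_calc_rows_per_block(ncols_dst);
    // Round up so that a partial final block is still launched.
    const int nblocks = (nrows + rows_per_block - 1) / rows_per_block;
    return { nblocks, sketch_calc_nwarps(ncols_dst) };
}

// Because the helpers are constexpr, the values can also be checked at compile time.
static_assert(sketch_calc_nwarps(1) == 4, "single-column case uses 4 warps in this sketch");
static_assert(sketch_calc_rows_per_block(1) == 1, "single-column case uses 1 row per block in this sketch");
```

In a CUDA or HIP build, a kernel templated on `ncols_dst` could call the same two helpers for its internal indexing, while the host calls `sketch_launch_config` to size the grid. Both sides then read their values from a single definition.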