llama.cpp
CUDA/HIP: refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.
#12177
Merged

Commits
  • refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.
    IMbackK committed 197 days ago
  • make CUDA happy, as it doesn't support calling host constexpr functions in device code, even though that should not be a problem.
    IMbackK committed 197 days ago
  • Fix nits
    IMbackK committed 194 days ago
  • Fix spelling of parameter
    IMbackK committed 194 days ago
  • Update ggml/src/ggml-cuda/mmvq.cu
    IMbackK committed 194 days ago
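
The idea named in the first two commits can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the names `EXAMPLE_HOST_DEVICE`, `example_table_id`, `example_calc_nwarps`, `example_calc_rows_per_block`, and `example_make_launch_dims` are hypothetical, and the numeric values are placeholders. It assumes the approach is to put the nwarps and rows-per-block calculation in a single `constexpr` function marked `__host__ __device__`, so the host launch code and the kernel evaluate the same expression. By default CUDA rejects calls from device code to a `constexpr` function that only has host linkage, which is why the explicit qualifiers are needed.

```cpp
#include <cassert>

// Under nvcc/hipcc the helpers are callable from host and device code;
// in a plain C++ build the qualifiers expand to nothing.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define EXAMPLE_HOST_DEVICE __host__ __device__
#else
#define EXAMPLE_HOST_DEVICE
#endif

// Hypothetical parameter table selector (e.g. chosen per GPU architecture).
enum example_table_id { EXAMPLE_TABLE_GENERIC = 0, EXAMPLE_TABLE_SMALL };

// Single source of truth for the number of warps per block.
// The values are placeholders, not the ones used in llama.cpp.
static constexpr EXAMPLE_HOST_DEVICE int example_calc_nwarps(int ncols_dst, example_table_id id) {
    return id == EXAMPLE_TABLE_GENERIC ? (ncols_dst <= 4 ? 4 : 2) : 1;
}

// Single source of truth for the number of output rows handled per block.
static constexpr EXAMPLE_HOST_DEVICE int example_calc_rows_per_block(int ncols_dst, example_table_id id) {
    return id == EXAMPLE_TABLE_GENERIC ? (ncols_dst == 1 ? 1 : 2) : 1;
}

// Because the helpers are constexpr, they can also be checked at compile time.
static_assert(example_calc_nwarps(1, EXAMPLE_TABLE_GENERIC) == 4, "placeholder table changed");

struct example_launch_dims {
    int nblocks;  // grid size along the row dimension
    int nwarps;   // warps per block
};

// Host side: derive the launch configuration from the same helpers that the
// kernel would call to size its shared memory and loop bounds.
static example_launch_dims example_make_launch_dims(int ncols_dst, example_table_id id, int nrows) {
    assert(nrows > 0);
    const int rows_per_block = example_calc_rows_per_block(ncols_dst, id);
    const int nblocks = (nrows + rows_per_block - 1) / rows_per_block;  // ceiling division
    return { nblocks, example_calc_nwarps(ncols_dst, id) };
}
```

With this layout, changing a value in one helper changes both the launch configuration and the kernel's assumptions together, which is the kind of host/device mismatch the refactor appears intended to rule out.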