llama.cpp
CUDA/HIP: refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.
#12177
Merged
Commits
refactor mmvq to unify the calculation of nwarps and rows per block between host and device code.
IMbackK committed 197 days ago
make CUDA happy, as it doesn't support calling host constexpr functions in device code, even though that should not be a problem.
IMbackK committed 197 days ago
Fix nits
IMbackK committed 194 days ago
Fix spelling of parameter
IMbackK committed 194 days ago
Update ggml/src/ggml-cuda/mmvq.cu
IMbackK committed 194 days ago
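The idea named in the title and in the second commit message can be illustrated with a minimal sketch: a single constexpr function, marked for both host and device, computes the launch parameters so that the kernel and the host launch code cannot drift apart. Everything below is an assumption made for illustration: the `sketch_*` names, the `HOST_DEVICE` macro, the table ids, and the numeric values are hypothetical and are not taken from the PR or from `mmvq.cu`.

```cpp
#include <cassert>

// Under nvcc/hipcc the helpers are annotated for both host and device;
// in a plain C++ build the macro expands to nothing so this still compiles.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical parameter-table ids (illustrative only).
enum sketch_param_table { SKETCH_TABLE_GENERIC = 0, SKETCH_TABLE_SMALL = 1 };

// Hypothetical rule: fewer warps per block as the number of dst columns grows.
static constexpr HOST_DEVICE int sketch_calc_nwarps(int ncols_dst, sketch_param_table table) {
    return table == SKETCH_TABLE_GENERIC
        ? (ncols_dst <= 4 ? 4 : (ncols_dst <= 8 ? 2 : 1))
        : 1;
}

// Hypothetical rule: one row per block for a single dst column, else two.
static constexpr HOST_DEVICE int sketch_calc_rows_per_block(int ncols_dst, sketch_param_table table) {
    return table == SKETCH_TABLE_GENERIC ? (ncols_dst == 1 ? 1 : 2) : 1;
}

struct sketch_launch_dims {
    int block_x; // threads per warp
    int block_y; // warps per block
    int grid_x;  // number of blocks along the row dimension
};

// Host side: derives the launch configuration from the same functions a
// kernel would call, instead of duplicating the rules in a separate switch.
static sketch_launch_dims sketch_host_launch_dims(int nrows, int ncols_dst,
                                                  sketch_param_table table, int warp_size) {
    const int nwarps         = sketch_calc_nwarps(ncols_dst, table);
    const int rows_per_block = sketch_calc_rows_per_block(ncols_dst, table);
    const int nblocks        = (nrows + rows_per_block - 1) / rows_per_block; // ceiling division
    return { warp_size, nwarps, nblocks };
}

// Compile-time check that the helpers are usable in constant expressions,
// which is what a kernel template would need for its own sizing.
static_assert(sketch_calc_nwarps(1, SKETCH_TABLE_GENERIC) == 4, "sketch value");
static_assert(sketch_calc_rows_per_block(2, SKETCH_TABLE_GENERIC) == 2, "sketch value");
```

In a CUDA or HIP build, a kernel could evaluate `sketch_calc_nwarps` and `sketch_calc_rows_per_block` as constant expressions for its own sizing, while the host calls `sketch_host_launch_dims` to pick the matching grid and block dimensions. The explicit host/device annotation reflects the constraint described in the second commit message.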