CUDA: Quantized matrix matrix multiplication #2160
JohannesGaessler
changed the title from "Cuda matrix matrix 6" to "CUDA: Quantized matrix matrix multiplication" 2 years ago
mmq implementation for non k-quants
ddb37bf8
q6_K
4b3af63e
q2_K
5bff3df0
q3_k
a62bcc89
q4_K
b59cd1dc
vdr
5d8b3de4
q5_K
b53e7138
faster q8_1 loading
a3505fac
loop unrolling
6808800c
add __restrict__
58daf95a
q2_K sc_high
abed4463
GGML_CUDA_MMQ_Y
3c09e11c
Updated Makefile
038ed631
Update Makefile
495c8981
DMMV_F16 -> F16
656c1ab3
Updated README, CMakeLists
aa4b2c93
slaren commented on 2023-07-29
Fix CMakeLists.txt
c0dfd5a5
Fix CMakeLists.txt
0b5f9891
Fix multi GPU out-of-bounds
0bb22bb4
slaren approved these changes on 2023-07-29