PR #5370 CUDA: mul_mat_vec_q max. batch size 8 -> 4

ggerganov merged 1 commit into ggml-org:master from JohannesGaessler:cuda-mmvq-limit

CUDA: mul_mat_vec_q max. batch size 8 -> 4

4e1d68b3

ggerganov approved these changes on 2024-02-06

ggerganov merged 17c97fb0 into master 2 years ago

Reviewers

ggerganov
