llama.cpp
CUDA: mul_mat_vec_q max. batch size 8 -> 4
#5370
Merged
ggerganov merged 1 commit into ggml-org:master from JohannesGaessler:cuda-mmvq-limit
CUDA: mul_mat_vec_q max. batch size 8 -> 4
4e1d68b3
ggerganov approved these changes on 2024-02-06
ggerganov merged 17c97fb0 into master 2 years ago
Reviewers: ggerganov
Assignees: No one assigned
Labels: None yet
Milestone: No milestone