llama.cpp
5.5x more CUDA performance with 5 minutes of work
#2140

Merged

JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:cuda-mmvq-pascal

CUDA: add __restrict__ to mul mat vec kernels

c8abd83c
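The diff itself is not shown on this page, so the following is only a minimal sketch of what adding `__restrict__` to a matrix-vector multiplication kernel can look like. The kernel name `mul_mat_vec_f32`, its parameters, the one-thread-per-row layout, and the host helper `mul_mat_vec` are all hypothetical and are not taken from the commit. The general idea is that `const` combined with `__restrict__` promises the compiler that the pointers never alias, which on NVIDIA GPUs can allow loads to go through the read-only data cache; whether that is the mechanism behind the speedup in the title is an assumption here.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel (not the one from the PR): one thread computes one output row.
// `const` + `__restrict__` tells the compiler that x, v, and dst never overlap,
// so it is free to reorder loads and, where supported, use read-only cached loads.
__global__ void mul_mat_vec_f32(const float * __restrict__ x,   // matrix, row-major, nrows x ncols
                                const float * __restrict__ v,   // vector, ncols entries
                                float       * __restrict__ dst, // result, nrows entries
                                const int ncols, const int nrows) {
    const int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= nrows) {
        return;
    }
    float sum = 0.0f;
    for (int col = 0; col < ncols; ++col) {
        sum += x[row * ncols + col] * v[col];
    }
    dst[row] = sum;
}

// Hypothetical host helper: copies the inputs to the GPU, launches the kernel,
// and returns the result. Error checking is omitted to keep the sketch short.
std::vector<float> mul_mat_vec(const std::vector<float> & x,
                               const std::vector<float> & v,
                               const int nrows, const int ncols) {
    float * d_x   = nullptr;
    float * d_v   = nullptr;
    float * d_dst = nullptr;
    cudaMalloc(&d_x,   x.size() * sizeof(float));
    cudaMalloc(&d_v,   v.size() * sizeof(float));
    cudaMalloc(&d_dst, nrows    * sizeof(float));
    cudaMemcpy(d_x, x.data(), x.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_v, v.data(), v.size() * sizeof(float), cudaMemcpyHostToDevice);

    const int block = 256;
    const int grid  = (nrows + block - 1) / block;
    mul_mat_vec_f32<<<grid, block>>>(d_x, d_v, d_dst, ncols, nrows);

    // This blocking copy waits for the kernel to finish before reading dst.
    std::vector<float> dst(nrows);
    cudaMemcpy(dst.data(), d_dst, nrows * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_x);
    cudaFree(d_v);
    cudaFree(d_dst);
    return dst;
}
```

Calling `mul_mat_vec({1, 2, 3, 4, 5, 6}, {1, 1, 1}, 2, 3)` multiplies a 2x3 matrix by a vector of ones. Note that `__restrict__` is a promise, not a check: if the buffers did overlap, the behavior would be undefined.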

slaren approved these changes on 2023-07-07

JohannesGaessler merged 061f5f8d into master 2 years ago
