llama.cpp
CUDA: optimize and refactor MMQ
#8416
Merged

JohannesGaessler
JohannesGaessler CUDA: optimize and refactor MMQ
f4b8df49
github-actions added the Nvidia GPU label
JohannesGaessler added the Review Complexity : High label
slaren
JohannesGaessler
slaren
JohannesGaessler explicit q8_1 memory layouts, add documentation
3c80cddb
slaren
slaren approved these changes on 2024-07-11
JohannesGaessler merged 808aba39 into master 1 year ago

