llama.cpp
cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization
#19624
Merged

am17an merged 3 commits into ggml-org:master from dfriehs:iq2xxs-cuda
dfriehs
dfriehs requested a review from JohannesGaessler 5 days ago
github-actions added Nvidia GPU
github-actions added ggml
dfriehs cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization
be3a90c9
dfriehs force pushed from da09e4ff to be3a90c9 5 days ago
dfriehs changed the title from "cuda: optimize iq2xxs dequantization" to "cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization" 5 days ago
dfriehs cuda: iq2xxs: simplify sum scaling
dfc0e2c9
am17an
am17an commented on 2026-02-15
BrickBee
dfriehs uint -> uint32_t
b82a9807
am17an
am17an approved these changes on 2026-02-15
am17an merged 27b93cbd into master 4 days ago
dfriehs deleted the iq2xxs-cuda branch 4 days ago
