llama.cpp
#12000 — cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (#10976)
Merged
Opened by gcp
github-actions added labels: Nvidia GPU, ggml
JohannesGaessler commented on 2025-02-21
gcp force-pushed to 295573fc 303 days ago — cuda: Add Q5_1, Q5_0, Q4_1 and Q4_0 to F32 conversion support. (#10976)
JohannesGaessler approved these changes on 2025-02-22
JohannesGaessler merged d7090842 into master 302 days ago