llama.cpp
Support requantizing models instead of only allowing quantization from 16/32bit
#1691
Merged

KerfuffleV2 added the enhancement label
KerfuffleV2 removed the enhancement label
KerfuffleV2 added the research 🔬 label
KerfuffleV2 force-pushed 2 years ago
KerfuffleV2 Add support for quantizing already quantized models (5c7b0e79)
KerfuffleV2 Threaded dequantizing and f16 to f32 conversion (fe9ed7d3)
KerfuffleV2 Clean up thread blocks with spares calculation a bit (7ed5aca9)
KerfuffleV2 Use std::runtime_error exceptions. (b3d605dc)
KerfuffleV2 force-pushed to b3d605dc 2 years ago
KerfuffleV2 changed the title from "Add support for quantizing already quantized models" to "Support requantizing models instead of only allowing quantization from 16/32bit" 2 years ago
ggerganov approved these changes on 2023-06-10
ggerganov merged 4f0154b0 into master 2 years ago
KerfuffleV2 deleted the feat-quantize-quantized branch 2 years ago
