llama.cpp
Support requantizing models instead of only allowing quantization from 16/32bit
#1691
Merged

KerfuffleV2 added the enhancement label
KerfuffleV2 removed the enhancement label
KerfuffleV2 added the research 🔬 label
KerfuffleV2 force-pushed 2 years ago
KerfuffleV2 Add support for quantizing already quantized models (5c7b0e79)
KerfuffleV2 Threaded dequantizing and f16 to f32 conversion (fe9ed7d3)
KerfuffleV2 Clean up thread blocks with spares calculation a bit (7ed5aca9)
KerfuffleV2 Use std::runtime_error exceptions. (b3d605dc)
KerfuffleV2 force-pushed to b3d605dc 2 years ago
KerfuffleV2 changed the title from "Add support for quantizing already quantized models" to "Support requantizing models instead of only allowing quantization from 16/32bit" 2 years ago
ggerganov approved these changes on 2023-06-10
ggerganov merged 4f0154b0 into master 2 years ago
KerfuffleV2 deleted the feat-quantize-quantized branch 2 years ago
