Support requantizing models instead of only allowing quantization from 16/32bit #1691
Commits:
5c7b0e79  Add support for quantizing already quantized models (sketched below)
fe9ed7d3  Threaded dequantizing and f16 to f32 conversion (sketched below, together with the spares handling)
7ed5aca9  Clean up thread blocks with spares calculation a bit
b3d605dc  Use std::runtime_error exceptions. (sketched below)
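The first commit carries the core change: a tensor that is already quantized is first dequantized back to f32, then quantized into the requested target format. Below is a minimal sketch of that round trip, not the PR's actual code; the function-pointer types stand in for ggml's per-type row (de)quantization routines, and the name `requantize_row` is hypothetical.

```cpp
#include <vector>

// Hypothetical stand-ins for ggml's per-type row (de)quantization functions.
using dequantize_row_fn = void (*)(const void * src, float * dst, int n);
using quantize_row_fn   = void (*)(const float * src, void * dst, int n);

// Requantize one row: quantized source -> f32 scratch -> new quantized format.
void requantize_row(const void * src, void * dst, int n_elements,
                    dequantize_row_fn dequantize, quantize_row_fn quantize) {
    std::vector<float> scratch(n_elements);
    dequantize(src, scratch.data(), n_elements); // recover approximate floats
    quantize(scratch.data(), dst, n_elements);   // re-encode in target format
}
```

Note that the f32 scratch values are already approximations of the original weights, so requantizing compounds quantization error; quantizing from the original f16/f32 model remains preferable when that source is available.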
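The second and third commits split the dequantize/convert work across threads and tidy how the leftover elements (the "spares", n % n_threads) are distributed among them. A hedged sketch of that partitioning, applied to f16-to-f32 conversion; the fp16 decoder below is a plain stand-in for illustration, not ggml's own ggml_fp16_to_fp32.

```cpp
#include <cmath>
#include <cstdint>
#include <thread>
#include <vector>

// Stand-in IEEE half -> float decoder (ggml uses its own ggml_fp16_to_fp32).
static float fp16_to_fp32(uint16_t h) {
    const float sign = (h >> 15) ? -1.0f : 1.0f;
    const int   exp  = (h >> 10) & 0x1f;
    const int   mant = h & 0x3ff;
    if (exp == 0)  return sign * std::ldexp((float) mant, -24);     // subnormal
    if (exp == 31) return mant ? NAN : sign * INFINITY;             // inf/nan
    return sign * std::ldexp((float)(mant + 1024), exp - 25);       // normal
}

// Split n elements over n_threads: each thread gets n / n_threads elements,
// and the first (n % n_threads) threads each take one spare so nothing is
// left over.
static void convert_f16_to_f32_mt(const uint16_t * src, float * dst,
                                  size_t n, unsigned n_threads) {
    const size_t per_thread = n / n_threads;
    const size_t spare      = n % n_threads;
    std::vector<std::thread> workers;
    size_t begin = 0;
    for (unsigned t = 0; t < n_threads; ++t) {
        const size_t count = per_thread + (t < spare ? 1 : 0);
        workers.emplace_back([=] {
            for (size_t i = begin; i < begin + count; ++i) {
                dst[i] = fp16_to_fp32(src[i]);
            }
        });
        begin += count;
    }
    for (auto & w : workers) {
        w.join();
    }
}
```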
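The last commit switches error reporting to std::runtime_error exceptions. An illustrative sketch of the pattern (the function and message here are not from the PR): throwing lets callers handle all failures through one exception type instead of checking return codes after printed errors.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Illustrative only: report failure by throwing rather than printing an
// error and returning a sentinel value.
static FILE * open_model_file(const std::string & path) {
    FILE * f = std::fopen(path.c_str(), "rb");
    if (!f) {
        throw std::runtime_error("failed to open model file: " + path);
    }
    return f;
}
```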
KerfuffleV2 changed the title from "Add support for quantizing already quantized models" to "Support requantizing models instead of only allowing quantization from 16/32bit" 2 years ago
ggerganov approved these changes on 2023-06-10
ggerganov merged 4f0154b0 into master 2 years ago
KerfuffleV2 deleted the feat-quantize-quantized branch 2 years ago
Assignees: No one assigned