llama.cpp
95930da3 - convert-hf : get bit-exact same output as ./quantize

convert-hf : get bit-exact same output as ./quantize

The quantization version was missing.

* convert-hf : don't round bf16 NaNs
* convert-hf : save some memory with np.int16 intermediate bf16 weights
* convert-hf : more closely match llama.cpp with which weights to keep in f32
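The commit message does not include the code, but the bf16 handling it describes can be sketched roughly as below: convert fp32 weight bits to bf16 with round-to-nearest-even, truncate NaNs instead of rounding them (rounding could carry into the exponent and corrupt the NaN), and keep the result as a 16-bit integer array to save memory. The function name `fp32_to_bf16` and the exact bit manipulations are illustrative assumptions, not the actual llama.cpp implementation.

```python
import numpy as np

def fp32_to_bf16(weights: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: fp32 -> bf16 bit patterns.

    Round-to-nearest-even for normal values; NaNs are truncated rather
    than rounded so they stay NaN and keep their payload bits.
    """
    bits = weights.astype(np.float32, copy=False).view(np.uint32)
    # NaN iff exponent is all ones and mantissa is nonzero
    is_nan = (bits & 0x7FFFFFFF) > 0x7F800000
    # round-to-nearest-even; widen to uint64 so the bias add cannot wrap
    rounded = (bits.astype(np.uint64) + 0x7FFF + ((bits >> 16) & 1)) >> 16
    # for NaNs, keep the top 16 bits as-is (still a NaN in bf16)
    out = np.where(is_nan, bits >> 16, rounded)
    # a 16-bit intermediate halves the memory footprint vs. fp32
    return out.astype(np.uint16)
```

Storing the intermediate as 16-bit integers rather than fp32 is what the "save some memory" bullet refers to: the bf16 payload only needs the top half of each fp32 word.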