llama.cpp
1966eb26 - quantize : add '--keep-split' to quantize model into shards (#6688)

Commit
1 year ago
quantize : add '--keep-split' to quantize model into shards (#6688)

* Implement '--keep-split' to quantize model into several shards

* Add test script

* Update examples/quantize/quantize.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Split model correctly even if tensor id is out-of-order

* Update llama_model_quantize_params

* Fix preci failures

---------

Co-authored-by: z5269887 <z5269887@unsw.edu.au>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>