llama.cpp
1966eb26 - quantize : add '--keep-split' to quantize model into shards (#6688)

Commit

1 year ago

quantize : add '--keep-split' to quantize model into shards (#6688)

* Implement '--keep-split' to quantize model into several shards
* Add test script
* Update examples/quantize/quantize.cpp

  Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* Split model correctly even if tensor id is out-of-order
* Update llama_model_quantize_params
* Fix preci failures

---------

Co-authored-by: z5269887 <z5269887@unsw.edu.au>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

References

#6688 - Implement '--keep-split' to quantize model into several shards

Author

zj040045

Parents

784e11de
