llama.cpp
Implement '--keep-split' to quantize model into several shards
#6688
Merged

zj040045
Implement '--keep-split' to quantize model into several shards
17519e11
phymbert
phymbert added the split label
Add test script
79bbf424
zj040045
phymbert requested a review from ggerganov 1 year ago
github-actions
phymbert commented on 2024-04-18
ggerganov commented on 2024-04-19
zj040045 Update examples/quantize/quantize.cpp
6d66e609
Split model correctly even if tensor id is out-of-order
d6e453eb
Update llama_model_quantize_params
141eb510
Fix preci failures
e0a3679a
ggerganov approved these changes on 2024-04-25
ggerganov merged 1966eb26 into master 1 year ago