llama.cpp
imatrix : use GGUF to store importance matrices
#9400
Merged

compilade merged 32 commits into master from compilade/imatrix-batched-chunks
compilade imatrix : allow processing multiple chunks per batch
bce54642
compilade imatrix : fix segfault when using a single chunk per batch
347247a2
compilade imatrix : use GGUF to store imatrix data
3de9300c
compilade imatrix : fix conversion problems
c8ab6a3b
compilade Merge branch 'master' into compilade/imatrix-batched-chunks
3ad0603c
compilade imatrix : use FMA and sort tensor names
d19101c9
compilade py : add requirements for legacy imatrix convert script
503630e8
compilade perplexity : revert changes
9e6b0e94
compilade added enhancement
compilade added breaking change
compilade added refactoring
compilade added examples
compilade added python
compilade added Review Complexity : Medium
compilade py : include imatrix converter requirements in toplevel requirements
894ed8d7
compilade imatrix : avoid using designated initializers in C++
efa9186d
compilade imatrix : remove unused n_entries
22172470
ngxson commented on 2024-09-10
compilade imatrix : allow loading mis-ordered tensors
8c13e16b
compilade quantize : use unused imatrix chunk_size with LLAMA_TRACE
2d79a707
compilade marked this pull request as draft 1 year ago
compilade common : use GGUF for imatrix output by default
c7a32e76
compilade Merge branch 'master' into compilade/imatrix-batched-chunks
db502ddd
compilade Merge branch 'master' into compilade/imatrix-batched-chunks
1be357d9
ggerganov
compilade Merge branch 'master' into compilade/imatrix-batched-chunks
16202d6f
compilade imatrix : two-way conversion between old format and GGUF
a5165a6c
compilade convert : remove imatrix to gguf python script
635f945e
compilade imatrix : use the function name in more error messages
1d190259
compilade Merge branch 'master' into compilade/imatrix-batched-chunks
2c094502
compilade imatrix : don't use FMA explicitly
ba6f6be6
compilade imatrix : avoid returning from void function save_imatrix
1a9454a3
compilade imatrix : support 3d tensors with MUL_MAT
43cd2b3e
compilade quantize : fix dataset name loading from gguf imatrix
0e793550
compilade Merge branch 'master' into compilade/imatrix-batched-chunks
118d52fe
compilade marked this pull request as ready for review 136 days ago
CISC commented on 2025-06-23
compilade common : move string_remove_suffix from quantize and imatrix
e33de128
CISC approved these changes on 2025-06-24
JohannesGaessler
CISC
bartowski1182
danielhanchen
compilade commented on 2025-07-07
slaren
nicoboss
saood06
EAddario
compilade Merge branch 'master' into compilade/imatrix-batched-chunks
0ee322cd
compilade imatrix : add warning when legacy format is written
42423ec4
compilade imatrix : warn when writing partial data, to help guess dataset coverage
50f53b3e
compilade imatrix : avoid loading model to convert or combine imatrix
183eeb55
compilade imatrix : avoid using imatrix.dat in README
942c55cd
compilade
nicoboss
EAddario
CISC
ubergarm
CISC
EAddario
compilade commented on 2025-07-19
compilade merged 90083283 into master 110 days ago
