PR #9400 imatrix : use GGUF to store importance matrices

compilade merged 32 commits into master from compilade/imatrix-batched-chunks

imatrix : allow processing multiple chunks per batch

bce54642

imatrix : fix segfault when using a single chunk per batch

347247a2

imatrix : use GGUF to store imatrix data

3de9300c

imatrix : fix conversion problems

c8ab6a3b

Merge branch 'master' into compilade/imatrix-batched-chunks

3ad0603c

imatrix : use FMA and sort tensor names

d19101c9

py : add requirements for legacy imatrix convert script

503630e8

perplexity : revert changes

9e6b0e94

compilade added enhancement

compilade added breaking change

compilade added refactoring

compilade added examples

compilade added python

compilade added Review Complexity : Medium

py : include imatrix converter requirements in toplevel requirements

894ed8d7

imatrix : avoid using designated initializers in C++

efa9186d

imatrix : remove unused n_entries

22172470

ngxson commented on 2024-09-10

imatrix : allow loading mis-ordered tensors

8c13e16b

quantize : use unused imatrix chunk_size with LLAMA_TRACE

2d79a707

compilade marked this pull request as draft 1 year ago

common : use GGUF for imatrix output by default

c7a32e76

Merge branch 'master' into compilade/imatrix-batched-chunks

db502ddd

Merge branch 'master' into compilade/imatrix-batched-chunks

1be357d9

Merge branch 'master' into compilade/imatrix-batched-chunks

16202d6f

imatrix : two-way conversion between old format and GGUF

a5165a6c

convert : remove imatrix to gguf python script

635f945e

imatrix : use the function name in more error messages

1d190259

Merge branch 'master' into compilade/imatrix-batched-chunks

2c094502

imatrix : don't use FMA explicitly

ba6f6be6

imatrix : avoid returning from void function save_imatrix

1a9454a3

imatrix : support 3d tensors with MUL_MAT

43cd2b3e

quantize : fix dataset name loading from gguf imatrix

0e793550

Merge branch 'master' into compilade/imatrix-batched-chunks

118d52fe

compilade marked this pull request as ready for review 265 days ago

CISC commented on 2025-06-23

common : move string_remove_suffix from quantize and imatrix

e33de128

CISC approved these changes on 2025-06-24

compilade commented on 2025-07-07

Merge branch 'master' into compilade/imatrix-batched-chunks

0ee322cd

imatrix : add warning when legacy format is written

42423ec4

imatrix : warn when writing partial data, to help guess dataset coverage

50f53b3e

imatrix : avoid loading model to convert or combine imatrix

183eeb55

imatrix : avoid using imatrix.dat in README

942c55cd

compilade commented on 2025-07-19

compilade merged 90083283 into master 239 days ago

Reviewers

CISC

ngxson

danielhanchen

jukofyork

Assignees

No one assigned

Labels

enhancement, breaking change, refactoring, examples, python, Review Complexity : Medium

Milestone

No milestone
