llama.cpp
imatrix : use GGUF to store importance matrices
#9400
Merged
compilade merged 32 commits into master from compilade/imatrix-batched-chunks
imatrix : allow processing multiple chunks per batch
bce54642
imatrix : fix segfault when using a single chunk per batch
347247a2
imatrix : use GGUF to store imatrix data
3de9300c
imatrix : fix conversion problems
c8ab6a3b
Merge branch 'master' into compilade/imatrix-batched-chunks
3ad0603c
imatrix : use FMA and sort tensor names
d19101c9
py : add requirements for legacy imatrix convert script
503630e8
perplexity : revert changes
9e6b0e94
compilade added the enhancement label
compilade added the breaking change label
compilade added the refactoring label
compilade added the examples label
compilade added the python label
compilade added the Review Complexity : Medium label
py : include imatrix converter requirements in toplevel requirements
894ed8d7
imatrix : avoid using designated initializers in C++
efa9186d
imatrix : remove unused n_entries
22172470
ngxson commented on 2024-09-10
imatrix : allow loading mis-ordered tensors
8c13e16b
quantize : use unused imatrix chunk_size with LLAMA_TRACE
2d79a707
compilade marked this pull request as draft 1 year ago
common : use GGUF for imatrix output by default
c7a32e76
Merge branch 'master' into compilade/imatrix-batched-chunks
db502ddd
Merge branch 'master' into compilade/imatrix-batched-chunks
1be357d9
Merge branch 'master' into compilade/imatrix-batched-chunks
16202d6f
imatrix : two-way conversion between old format and GGUF
a5165a6c
convert : remove imatrix to gguf python script
635f945e
imatrix : use the function name in more error messages
1d190259
Merge branch 'master' into compilade/imatrix-batched-chunks
2c094502
imatrix : don't use FMA explicitly
ba6f6be6
imatrix : avoid returning from void function save_imatrix
1a9454a3
imatrix : support 3d tensors with MUL_MAT
43cd2b3e
quantize : fix dataset name loading from gguf imatrix
0e793550
Merge branch 'master' into compilade/imatrix-batched-chunks
118d52fe
compilade marked this pull request as ready for review 136 days ago
CISC commented on 2025-06-23
common : move string_remove_suffix from quantize and imatrix
e33de128
CISC approved these changes on 2025-06-24
compilade commented on 2025-07-07
Merge branch 'master' into compilade/imatrix-batched-chunks
0ee322cd
imatrix : add warning when legacy format is written
42423ec4
imatrix : warn when writing partial data, to help guess dataset coverage
50f53b3e
imatrix : avoid loading model to convert or combine imatrix
183eeb55
imatrix : avoid using imatrix.dat in README
942c55cd
compilade commented on 2025-07-19
compilade merged 90083283 into master 110 days ago
Reviewers
CISC
ngxson
danielhanchen
jukofyork
Assignees
No one assigned
Labels
enhancement
breaking change
refactoring
examples
python
Review Complexity : Medium
Milestone
No milestone