support packing immediately for gguf to reduce ram usage #638
support
79af3b0c
wenhuach21
changed the title from "support packing immediately for gguf" to "[WIP]support packing immediately for gguf" 180 days ago
wenhuach21
marked this pull request as draft 180 days ago
fix
6c53511f
update
68eaa88e
change
6f2e330f
Merge branch 'main' into gguf_packing_immediately
b24bad34
wenhuach21
marked this pull request as ready for review 179 days ago
wenhuach21
changed the title from "[WIP]support packing immediately for gguf" to "support packing immediately for gguf" 179 days ago
change
25292467
fix some issues
41aa6c7b
tmp change
387b382a
mv gc
2f0495a6
tmp fix
bf7e89e2
fix bug
02500db5
fix clean moe bug
1837f5d2
merge
99ddcbc2
clean
ecef6b36
refine code
1e067b15
clean code
b8faef37
fix
73826b49
wenhuach21
changed the title from "support packing immediately for gguf" to "support packing immediately for gguf to reduce ram usage" 179 days ago
fix multiple gpu issue
18de9c82
fix typo
67972ebd
fix deepseekv3 issue
5ce94d2c
fix line too long
51662817
fix embedding mixed bits setting
4f98f4e7
update version to 0.6.0
31a52240
wenhuach21
deleted the gguf_packing_immediately branch 179 days ago