auto-round
support packing immediately for gguf to reduce ram usage
#638
Merged


wenhuach21 merged 23 commits into main from gguf_packing_immediately
wenhuach21 support
79af3b0c
wenhuach21 changed the title from "support packing immediately for gguf" to "[WIP]support packing immediately for gguf" 180 days ago
wenhuach21 marked this pull request as draft 180 days ago
wenhuach21 fix
6c53511f
wenhuach21 update
68eaa88e
wenhuach21 change
6f2e330f
wenhuach21 Merge branch 'main' into gguf_packing_immediately
b24bad34
wenhuach21 marked this pull request as ready for review 179 days ago
wenhuach21 changed the title from "[WIP]support packing immediately for gguf" to "support packing immediately for gguf" 179 days ago
wenhuach21 change
25292467
wenhuach21 fix some issues
41aa6c7b
wenhuach21 tmp change
387b382a
wenhuach21 mv gc
2f0495a6
wenhuach21 tmp fix
bf7e89e2
wenhuach21 fix bug
02500db5
n1ck-guo fix clean moe bug
1837f5d2
n1ck-guo merge
99ddcbc2
n1ck-guo clean
ecef6b36
wenhuach21 refine code
1e067b15
wenhuach21 clean code
b8faef37
wenhuach21 fix
73826b49
wenhuach21 changed the title from "support packing immediately for gguf" to "support packing immediately for gguf to reduce ram usage" 179 days ago
wenhuach21 fix multiple gpu issue
18de9c82
WeiweiZhang1 approved these changes on 2025-07-04
wenhuach21 fix typo
67972ebd
wenhuach21 fix deepseekv3 issue
5ce94d2c
wenhuach21 fix line too long
51662817
wenhuach21 fix embedding mixed bits setting
4f98f4e7
wenhuach21 update version to 0.6.0
31a52240
wenhuach21 merged 3b282dc9 into main 179 days ago
wenhuach21 deleted the gguf_packing_immediately branch 179 days ago
