llama.cpp
convert : use reflinks for faster conversion
#15727
Open

compilade
compilade added demo
compilade added python
github-actions added ggml
compilade force-pushed the compilade/convert-safetensors-parse branch from 786b32d8 to e582f1ac 155 days ago
compilade force-pushed to 833d03c2 155 days ago
compilade force-pushed the compilade/convert-safetensors-parse branch from e582f1ac to e996f3ae 96 days ago
compilade convert : use reflinks for faster conversion
562aa42c
compilade convert : fix reflinks for stacked MoE tensors
d9210570
compilade gguf-py : fix flake8 lint
791bd97b
compilade convert : detect filesystem block size for reflinks
c3738cfc
compilade convert : use F32 operations on Mamba A_log
614b95a8
compilade convert : allow sharding reflinked models
d3fcb0e9
compilade gguf-py : improve reflink size logging
5712aa89
compilade convert : more robust default ftype detection
e097d98a
compilade convert : remove unused field ModelTensorInfo.src_qtype
3126b5ee
compilade gguf-py : allow previewing reflinked size on non-Linux platforms
6ffa46d8
compilade convert : better logging of partially reflinkable tensors
4be1a5d4
compilade gguf-py : handle cross-filesystem file range copies
f88a4b93
compilade convert : for FP8, use scale type to decide auto type
2ef41855
compilade force-pushed from 833d03c2 to 2ef41855 96 days ago

Reviewers: No reviews
Assignees: No one assigned