llama.cpp
bc78bf4c - convert-hf : faster model parts loading

Commit
1 year ago
Instead of pre-loading them all into a dict, iterate on the tensors in the model parts progressively as needed in Model.write_tensors.

Conversion for some architectures relies on checking for the presence of specific tensor names, so for multi-part models, the weight map is read from the relevant json file to quickly get these names up-front.
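The two ideas in this message can be illustrated with a minimal, standard-library-only sketch. It is not the code from this commit: `read_tensor_names`, `iter_tensors`, and the `load_part` callable are hypothetical names chosen for illustration, and the index file is assumed to be a JSON object with a top-level "weight_map" key mapping tensor names to part file names, as in Hugging Face sharded checkpoints. A real converter would pass a loader backed by its actual tensor file format.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterable, Iterator, Mapping


def read_tensor_names(index_path: Path) -> set[str]:
    """Return every tensor name listed in a sharded-checkpoint index file.

    Assumes the file is JSON with a top-level "weight_map" object whose
    keys are tensor names. No tensor data is read, so this is cheap.
    """
    with open(index_path, encoding="utf-8") as f:
        index = json.load(f)
    return set(index["weight_map"].keys())


def iter_tensors(
    part_paths: Iterable[Any],
    load_part: Callable[[Any], Mapping[str, Any]],
) -> Iterator[tuple[str, Any]]:
    """Yield (name, tensor) pairs one model part at a time.

    `load_part` is a hypothetical callable that opens a single part and
    returns a name -> tensor mapping. Because this is a generator, a part
    is only opened when the consumer reaches it, instead of all parts
    being loaded into one dict up-front.
    """
    for part in part_paths:
        tensors = load_part(part)
        for name, data in tensors.items():
            yield name, data
```

A caller could first check `"some.tensor.name" in read_tensor_names(index_path)` to make an architecture-specific decision, then loop over `iter_tensors(parts, load_part)` while writing tensors out, so that only the part currently being processed needs to be open.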