llama.cpp
05872ac8 - convert : fix big-endian conversion (#17431)

Commit

18 days ago

convert : fix big-endian conversion (#17431)

* Fix convert_hf_to_gguf.py script on s390x

Assume converted model data is originally little-endian. Byteswap data on s390x after reading it to put values in the correct representation for any transformation needed, like calculating weight tensors. Then byteswap data to little-endian before passing it to GGUFWriter, while GGUFWriter will byteswap data back to big-endian if big-endian output is requested.

byteswap(inplace=True) calls don't work with lazy tensor and array wrappers. Use byteswap with copying data to work around this behaviour.

* Make GGUFWriter accept tensors in native endianness instead of little-endian

With this change, if no byteswapping is actually needed, two redundant byteswaps can be omitted on s390x.

* Fix byteswapping in convert_hf_to_gguf.py for remote models
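The byte-order flow described in this message can be illustrated with a minimal sketch. It is not the code from this commit: it uses the standard-library `array` module as a stand-in for the NumPy arrays and lazy tensors the converter works with, and the helper names `byteswapped_copy`, `from_little_endian`, and `to_target`, as well as the `host` and `target` parameters, are hypothetical. It assumes that source data is little-endian and that the writer receives native-endian values and swaps only when the requested output order differs from the host.

```python
import sys
from array import array


def byteswapped_copy(values: array) -> array:
    """Return a byteswapped copy; the input array is left untouched."""
    out = array(values.typecode, values)  # copy first
    out.byteswap()                        # swap the copy in place
    return out


def from_little_endian(values: array, host: str = sys.byteorder) -> array:
    """Reader side (hypothetical): data came from little-endian storage.

    On a big-endian host the values must be swapped to become native.
    """
    return byteswapped_copy(values) if host == "big" else values


def to_target(values: array, target: str, host: str = sys.byteorder) -> array:
    """Writer side (hypothetical): input is native-endian.

    Swap only when the requested output order differs from the host order.
    """
    return byteswapped_copy(values) if target != host else values


# Simulate a big-endian host that has just read little-endian 16-bit data.
raw = array("H", [1, 256])
native = from_little_endian(raw, host="big")
big_out = to_target(native, target="big", host="big")        # no swap needed
little_out = to_target(native, target="little", host="big")  # one swap
```

With a writer that takes native-endian input, producing big-endian output on a big-endian host needs no swap after the initial read, which is the saving the second bullet refers to.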

References

#17431 - Fix convert_hf_to_gguf.py script on s390x

Author

AlekseiNikiforovIBM

Parents

55ab25ca
