llama.cpp
3a14e003 - gguf-py : simplify support for quant types (#8838)

Commit

1 year ago

gguf-py : simplify support for quant types (#8838)

* gguf-py : use classes for quants

* convert_hf : simplify internal quantization type selection

* gguf-py : fix flake8 lint

* gguf-py : fix BF16 numpy view type

* gguf-py : remove LlamaFileTypeMap

  Too specific to 'llama.cpp', and would be a maintenance burden
  to keep up to date.

* gguf-py : add generic quantize and dequantize functions

  The quant classes no longer need to be known,
  only the target or the source type,
  for 'quantize' and 'dequantize', respectively.

References

#8838 - gguf-py : simplify support for quant types

Author

compilade

Parents

afd27f01
