llama.cpp
1c07c0c6 - convert : handle compressed-tensors quant method (#17069)

convert : handle compressed-tensors quant method (#17069)

* convert : handle compressed-tensors quant method
* convert : handle int-quantized models
* convert : handle naive-quantized models
* gguf-py : __pos__ is also unary
* convert : fix flake8 lint
* convert : use F32 for dequant of pack-quantized tensors
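The last change dequantizes pack-quantized tensors in float32. As a minimal sketch of what unpacking such a tensor can look like, assuming the common compressed-tensors pack-quantized convention of eight 4-bit two's-complement values packed least-significant-nibble-first into each int32 (the function names and exact layout here are illustrative assumptions, not the converter's actual code):

```python
import numpy as np

def unpack_int4(packed: np.ndarray, original_cols: int) -> np.ndarray:
    # Assumed layout: each int32 holds eight 4-bit values, low nibble first.
    shifts = np.arange(0, 32, 4, dtype=np.int32)
    vals = (packed[..., np.newaxis] >> shifts) & 0xF
    # Flatten the nibble axis into the column axis, drop any pad columns.
    vals = vals.reshape(*packed.shape[:-1], -1)[..., :original_cols]
    # Sign-extend 4-bit two's-complement values (8..15 -> -8..-1).
    return np.where(vals >= 8, vals - 16, vals).astype(np.int32)

def dequant(packed: np.ndarray, scale, original_cols: int) -> np.ndarray:
    q = unpack_int4(packed, original_cols)
    # Do the arithmetic in float32, matching the commit's use of F32.
    return q.astype(np.float32) * np.float32(scale)
```

For example, one int32 encoding the nibbles `1, -1, 2, 0, 3, -8, 7, 5` unpacks back to exactly those values, and multiplying by a per-tensor scale yields the float32 weights.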