llama.cpp
1c07c0c6 - convert : handle compressed-tensors quant method (#17069)

convert : handle compressed-tensors quant method (#17069)

* convert : handle compressed-tensors quant method
* convert : handle int-quantized models
* convert : handle naive-quantized models
* gguf-py : __pos__ is also unary
* convert : fix flake8 lint
* convert : use F32 for dequant of pack-quantized tensors
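The last change dequantizes pack-quantized tensors in float32. As a minimal sketch of what unpacking such a tensor can look like, assuming the common compressed-tensors pack-quantized convention of eight 4-bit two's-complement values packed least-significant-nibble-first into each int32 (the function names and exact layout here are illustrative assumptions, not the converter's actual code):

```python
import numpy as np

def unpack_int4(packed: np.ndarray, original_cols: int) -> np.ndarray:
    # Assumed layout: each int32 holds eight 4-bit values, low nibble first.
    shifts = np.arange(0, 32, 4, dtype=np.int32)
    vals = (packed[..., np.newaxis] >> shifts) & 0xF
    # Flatten the nibble axis into the column axis, drop any pad columns.
    vals = vals.reshape(*packed.shape[:-1], -1)[..., :original_cols]
    # Sign-extend 4-bit two's-complement values (8..15 -> -8..-1).
    return np.where(vals >= 8, vals - 16, vals).astype(np.int32)

def dequant(packed: np.ndarray, scale, original_cols: int) -> np.ndarray:
    q = unpack_int4(packed, original_cols)
    # Do the arithmetic in float32, matching the commit's use of F32.
    return q.astype(np.float32) * np.float32(scale)
```

For example, one int32 encoding the nibbles `1, -1, 2, 0, 3, -8, 7, 5` unpacks back to exactly those values, and multiplying by a per-tensor scale yields the float32 weights.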