llama.cpp
cb40dfca - llama : only use Q6_K for output weights if tensor size is multiple of 256 (#1932)

Commit

2 years ago

llama : only use Q6_K for output weights if tensor size is multiple of 256 (#1932)

* Only use Q6_K for output weights if tensor size is multiple of 256
* Fixed copy/paste mistake

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
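
The rule in the commit title can be illustrated with a minimal C++ sketch. This is not the code from the commit: the names `QuantType`, `kSuperBlockSize`, and `choose_type` are hypothetical. It assumes that Q6_K packs weights in super-blocks of 256 values and that "tensor size" means each of the two dimensions of the output weight matrix, which the message itself does not spell out.

```cpp
#include <cstdint>
#include <string>

// Assumption: k-quant formats such as Q6_K operate on super-blocks of 256 values.
constexpr std::int64_t kSuperBlockSize = 256;

// Hypothetical stand-in for the real quantization type enum.
enum class QuantType { Default, Q6_K };

// Hypothetical helper: return Q6_K for the output weights only when both
// dimensions are multiples of the super-block size; otherwise keep the
// type that was already selected for the tensor.
inline QuantType choose_type(const std::string & name,
                             std::int64_t nx, std::int64_t ny,
                             QuantType fallback) {
    const bool is_output = (name == "output.weight");
    const bool divisible = (nx % kSuperBlockSize == 0) && (ny % kSuperBlockSize == 0);
    if (is_output && divisible) {
        return QuantType::Q6_K;
    }
    return fallback;
}
```

Under these assumptions, an output tensor of shape 4096 x 32000 would be promoted to Q6_K, while one of shape 4096 x 32001 would keep its fallback type because 32001 is not a multiple of 256.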

References

#1932 - Only use Q6_K for output weights if tensor size is multiple of 256

Author

ikawrakow

Parents

ca7c3f4d
