llama.cpp
1d0331c1 - quantize: options for output and token embedding tensors qtype (#6239)

Commit

2 years ago

quantize: options for output and token embedding tensors qtype (#6239)

* quantize: be able to specify the output tensor type
* quantize: be able to specify the token embedding tensor type

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>

References

#6239 - quantize: be able to explicitly specify quantization type of output and token embedding tensors

Author

ikawrakow

Parents

dba1af61
