llama.cpp
96633eec - gemma : use more bits for the token_embd.weight tensor (#5650)

gemma : use more bits for the token_embd.weight tensor (#5650)

* gemma : use Q8_0 for the token_embd.weight tensor
* llama : quantize token_embd.weight using output type
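The commit message above describes the change: instead of quantizing Gemma's token_embd.weight with the default mixture type, it is quantized with the same type chosen for the output tensor (Gemma ties the input embedding and output weights, so keeping more bits in the embedding matrix avoids degrading the logits). The snippet below is a minimal, hypothetical sketch of that per-tensor type selection; the type names echo ggml's quant types, but the helper and enum here are illustrative, not the actual llama.cpp implementation.

```cpp
// Sketch (assumption, not llama.cpp source): pick a quantization type per
// tensor name, routing token_embd.weight to the output tensor's type.
#include <cstdio>
#include <string>

enum quant_type_sketch {
    TYPE_Q4_K,  // typical default for most weight matrices
    TYPE_Q6_K,  // commonly used for the output (lm_head) tensor
    TYPE_Q8_0,  // higher-precision option mentioned in the commit
};

// Hypothetical helper: choose a quant type for a tensor by name.
static quant_type_sketch pick_quant_type(const std::string & name,
                                         quant_type_sketch default_type,
                                         quant_type_sketch output_type) {
    if (name == "output.weight") {
        return output_type;
    }
    // After this change: the token embedding matrix gets the same type as the
    // output tensor, i.e. more bits than the default per-layer mixture type.
    if (name == "token_embd.weight") {
        return output_type;
    }
    return default_type;
}

int main() {
    const char * names[] = { "blk.0.attn_q.weight", "token_embd.weight", "output.weight" };
    for (const char * n : names) {
        printf("%-24s -> type %d\n", n, (int) pick_quant_type(n, TYPE_Q4_K, TYPE_Q6_K));
    }
    return 0;
}
```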