llama.cpp
96633eec - gemma : use more bits for the token_embd.weight tensor (#5650)

gemma : use more bits for the token_embd.weight tensor (#5650)

* gemma : use Q8_0 for the token_embd.weight tensor
* llama : quantize token_embd.weight using output type
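The commit message above describes the change: instead of quantizing Gemma's token_embd.weight with the default mixture type, it is quantized with the same type chosen for the output tensor (Gemma ties the input embedding and output weights, so keeping more bits in the embedding matrix avoids degrading the logits). The snippet below is a minimal, hypothetical sketch of that per-tensor type selection; the type names echo ggml's quant types, but the helper and enum here are illustrative, not the actual llama.cpp implementation.

```cpp
// Sketch (assumption, not llama.cpp source): pick a quantization type per
// tensor name, routing token_embd.weight to the output tensor's type.
#include <cstdio>
#include <string>

enum quant_type_sketch {
    TYPE_Q4_K,  // typical default for most weight matrices
    TYPE_Q6_K,  // commonly used for the output (lm_head) tensor
    TYPE_Q8_0,  // higher-precision option mentioned in the commit
};

// Hypothetical helper: choose a quant type for a tensor by name.
static quant_type_sketch pick_quant_type(const std::string & name,
                                         quant_type_sketch default_type,
                                         quant_type_sketch output_type) {
    if (name == "output.weight") {
        return output_type;
    }
    // After this change: the token embedding matrix gets the same type as the
    // output tensor, i.e. more bits than the default per-layer mixture type.
    if (name == "token_embd.weight") {
        return output_type;
    }
    return default_type;
}

int main() {
    const char * names[] = { "blk.0.attn_q.weight", "token_embd.weight", "output.weight" };
    for (const char * n : names) {
        printf("%-24s -> type %d\n", n, (int) pick_quant_type(n, TYPE_Q4_K, TYPE_Q6_K));
    }
    return 0;
}
```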