llama.cpp
Commit 7a221b67
1 year ago
llama : use F32 precision in Qwen2 attention and no FA (#8412)
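The change makes the KQ attention product for the Qwen2 architecture accumulate in F32 instead of F16, and disables flash attention ("no FA") for it, since Qwen2's attention scores can exceed the F16 range and produce NaNs. A minimal sketch of the kind of guard involved, assuming ggml's ggml_mul_mat_set_prec API; the llm_arch enum here is a hypothetical stand-in for llama.cpp's internal one, and this is not the verbatim diff:

```cpp
#include "ggml.h"

// Hypothetical stand-in for llama.cpp's internal architecture enum.
enum llm_arch { LLM_ARCH_QWEN2, LLM_ARCH_OTHER };

// Build the KQ product, forcing F32 precision on architectures where
// F16 accumulation overflows (Qwen2 among them), mirroring how llama.cpp
// already handles models such as Phi-2.
static struct ggml_tensor * build_kq(
        struct ggml_context * ctx,
        struct ggml_tensor  * k,
        struct ggml_tensor  * q,
        enum   llm_arch       arch) {
    struct ggml_tensor * kq = ggml_mul_mat(ctx, k, q);

    if (arch == LLM_ARCH_QWEN2) {
        // Qwen2 attention values fall outside the F16 range, which turns
        // into NaNs unless the multiplication accumulates in F32.
        ggml_mul_mat_set_prec(kq, GGML_PREC_F32);

        // The "no FA" part: flash attention would also be disabled for
        // this arch, e.g. at context setup (hypothetical placement):
        // cparams.flash_attn = false;
    }

    return kq;
}
```

This follows the existing per-architecture precision override pattern in llama.cpp rather than changing the default precision for all models, so other architectures keep the faster F16 path.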
References
#8412 - llama : use F32 precision in Qwen2 attention and no FA
Author
ggerganov
Parents
278d0e18