llama.cpp
llama : use F32 precision in Qwen2 attention and no FA
#8412

Merged

Commits

llama : use F32 precision in Qwen2 attention and no FA

ggerganov committed 1 year ago
