llama.cpp
llama : use F32 precision in Qwen2 attention and no FA
#8412
Merged


ggerganov merged 1 commit into master from gg/qwen2-f32-prec

Commit 7c9e9a22 (llama : use F32 precision in Qwen2 attention and no FA)

JohannesGaessler approved these changes on 2024-07-10
ggerganov merged 7a221b67 into master 1 year ago
ggerganov deleted the gg/qwen2-f32-prec branch 1 year ago
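Per the title, this change makes Qwen2's attention use F32 precision and disables Flash Attention for that architecture. A likely motivation for such precision bumps is that accumulating attention logits in fp16 can overflow or lose accuracy; the sketch below illustrates that failure mode with NumPy. The head size and value magnitudes are hypothetical, chosen only to make the overflow visible, and are not taken from the PR:

```python
import numpy as np

head_dim = 128  # hypothetical head size, for illustration only
q = np.full(head_dim, 30.0, dtype=np.float16)
k = np.full(head_dim, 30.0, dtype=np.float16)

# Accumulate the Q.K dot product entirely in fp16: each partial sum is
# rounded to fp16, and the running total eventually exceeds the fp16
# maximum (~65504), overflowing to inf.
logit_f16 = np.float16(0)
for a, b in zip(q, k):
    logit_f16 = np.float16(logit_f16 + a * b)

# Accumulating the same dot product in fp32 stays well within range.
logit_f32 = np.dot(q.astype(np.float32), k.astype(np.float32))

print(logit_f16)  # inf
print(logit_f32)  # 115200.0
```

Forcing F32 precision on the attention matmul trades some speed and memory for numerically stable logits, which is why such changes are typically applied per-architecture rather than globally.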
