llama : use F32 precision in Qwen2 attention and no FA #8412
Commit 7c9e9a22
ggerganov merged commit 7a221b67 into master 1 year ago
ggerganov deleted the gg/qwen2-f32-prec branch 1 year ago
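The PR title indicates that the KQ attention matrix for Qwen2 is now computed at F32 precision instead of F16, and that Flash Attention (FA) is disabled for this architecture. A common motivation for such a change (assumed here, not stated in the PR header) is that F16 accumulation of attention logits can overflow or lose precision, producing inf/NaN. A minimal NumPy sketch of the failure mode, with illustrative values not taken from the model:

```python
import numpy as np

# Illustrative only: accumulate a 128-element dot product (one attention
# logit q·k) in float16 vs float32. With moderately large activations,
# the running float16 sum exceeds the float16 max (~65504) and becomes inf,
# while the float32 accumulator stays finite.
d = 128                                  # hypothetical head dimension
q = np.full(d, 30.0, dtype=np.float16)   # hypothetical activation values
k = np.full(d, 30.0, dtype=np.float16)

acc16 = np.float16(0.0)
acc32 = np.float32(0.0)
for qi, ki in zip(q, k):
    acc16 = np.float16(acc16 + np.float16(qi) * np.float16(ki))
    acc32 = np.float32(acc32 + np.float32(qi) * np.float32(ki))

print("f16 accumulator:", acc16)   # overflows to inf
print("f32 accumulator:", acc32)   # 128 * 900 = 115200, finite
```

This is only a sketch of the numerical rationale; the actual change in llama.cpp selects a higher-precision code path for the KQ matrix multiplication rather than looping per element.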