llama.cpp
llama : fix FA when KV cache is not used (i.e. embeddings) #12825 (Merged)
ggerganov merged 3 commits into master from gg/embd-fix-fa:
- 3e6d1e4e ggml : FA supports F32 V
- 7cb9ae05 graph : cast KV to F16 when the KV cache is not used
- 997b1b42 server : add test that exercises embeddings with FA enabled
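The core of the fix is the second commit: when no KV cache backs the attention (the embeddings-only path), K and V reach the graph as plain F32 batch tensors, while the flash-attention path generally expects F16 K/V. Below is a minimal sketch of that idea, assuming only ggml's public API (ggml_cast, ggml_flash_attn_ext); it is an illustration of the technique, not the exact llama.cpp graph code, and the helper name and tensor setup are hypothetical.

```c
// Sketch only: illustrates the idea of commit 7cb9ae05, not the actual
// llama.cpp implementation. Assumes q/k/v/kq_mask were created in ctx by
// the caller, and that kq_mask is already F16 as the FA op requires.
#include "ggml.h"

static struct ggml_tensor * attn_without_kv_cache(
        struct ggml_context * ctx,
        struct ggml_tensor  * q,        // queries, F32
        struct ggml_tensor  * k,        // keys,    F32 (straight from the batch)
        struct ggml_tensor  * v,        // values,  F32
        struct ggml_tensor  * kq_mask,  // attention mask, F16
        float                 kq_scale) {
    // With a KV cache, the FA kernels read K/V back from the (typically F16)
    // cache. Without one, K/V arrive in F32, which is what broke FA for
    // embedding models; casting them to F16 restores the common path.
    k = ggml_cast(ctx, k, GGML_TYPE_F16);
    v = ggml_cast(ctx, v, GGML_TYPE_F16);

    // max_bias = 0.0f (no ALiBi), logit_softcap = 0.0f (disabled)
    return ggml_flash_attn_ext(ctx, q, k, v, kq_mask, kq_scale, 0.0f, 0.0f);
}
```

The first commit complements this by letting the FA op accept an F32 V, and the third commit covers the path end to end: a server test that requests embeddings from llama-server with flash attention enabled (roughly: start the server with an embedding model, --embeddings, and -fa, then POST to the embeddings endpoint).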
ggerganov requested a review from ngxson 153 days ago
github-actions added the examples, python, server, ggml, and Apple Metal labels
ngxson approved these changes on 2025-04-08
ggerganov merged a19b5cef into master 153 days ago
ggerganov deleted the gg/embd-fix-fa branch 153 days ago
Reviewers: ngxson
Assignees: no one assigned
Labels: examples, python, server, ggml, Apple Metal
Milestone: no milestone