llama.cpp
a5eaa1d6 - mla : make the V tensor a view of K (#18986)

Commit
5 days ago
mla : make the V tensor a view of K (#18986)

* mla : pass V as a view of K to the FA op

* cuda : adjust mla logic to new layout

* kv-cache : fix rope shift

* tests : remove comment

* cuda : fix reusable_cutoff

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>