llama.cpp
1dc9614e - llama : fix kq_scale for the attention layers of PLaMo2 (#14892)

Commit · 137 days ago
llama : fix kq_scale for the attention layers of PLaMo2 (#14892)

* Fix dimensions for expand
* Change dimensions to copy states to cache
* Fix the default value for plamo2 conversion
* Fix scale given to build_attn
* Update src/llama-model.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>