llama.cpp
b42c8b43 - refactor: Remove layer index from n_embd_k/v_s

Commit

229 days ago

refactor: Remove layer index from n_embd_k/v_s

Now that it's not used at all in the unified cache, we don't need to use the layer index to zero it out for attention layers.

Branch: HybridRecurrentCache
Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>

References

#13979 - Hybrid recurrent cache

Author

gabe-l-hart

Committer

gabe-l-hart

Parents

1dd12133
