llama.cpp
b42c8b43
- refactor: Remove layer index from n_embd_k/v_s
Go
Login via GitHub
Home
Pricing
FAQ
Install
Login
via GitHub
Commit
View On
GitHub
Commit
229 days ago
refactor: Remove layer index from n_embd_k/v_s Now that it's not used at all in the unified cache, we don't need to use the layer index to zero it out for attention layers. Branch: HybridRecurrentCache Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
References
#13979 - Hybrid recurrent cache
Author
gabe-l-hart
Committer
gabe-l-hart
Parents
1dd12133
Loading