llama.cpp
423c8940 - feat: Construct hybrid recurrent cache for hybrid recurrent models

Commit
194 days ago
feat: Construct hybrid recurrent cache for hybrid recurrent models

This includes a refactor of the create_memory logic to avoid needing to use the arch enum explicitly unless a model needs explicit cache instantiation logic beyond the standard logic for recurrent, hybrid, unified, and iswa.

Branch: HybridRecurrentCache

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
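
The dispatch described in the commit message can be pictured with a minimal, self-contained sketch. It is not the code from this commit: every type, field, and function name below (ModelTraits, Arch, create_memory_sketch, the cache structs) is hypothetical, and the order of the property checks is an assumption. The only idea taken from the message is that the arch enum is consulted solely for models that need custom cache construction, and all other models are routed by their properties to the recurrent, hybrid, unified, or iswa cache.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for illustration only; none of these are llama.cpp types.
enum class Arch { Generic, NeedsCustomCache };

struct ModelTraits {
    Arch arch         = Arch::Generic;
    bool is_recurrent = false;
    bool is_hybrid    = false;  // mixes attention and recurrent layers
    bool uses_swa     = false;  // sliding-window attention
};

struct Memory {
    virtual ~Memory() = default;
    virtual std::string kind() const = 0;
};

struct UnifiedCache   : Memory { std::string kind() const override { return "unified"; } };
struct IswaCache      : Memory { std::string kind() const override { return "iswa"; } };
struct RecurrentCache : Memory { std::string kind() const override { return "recurrent"; } };
struct HybridCache    : Memory { std::string kind() const override { return "hybrid"; } };
struct CustomCache    : Memory { std::string kind() const override { return "custom"; } };

std::unique_ptr<Memory> create_memory_sketch(const ModelTraits & m) {
    // The arch enum is inspected only for models that need bespoke construction.
    switch (m.arch) {
        case Arch::NeedsCustomCache:
            return std::make_unique<CustomCache>();
        default:
            break;
    }

    // Everything else is routed by model properties (check order is an assumption).
    if (m.is_hybrid)    { return std::make_unique<HybridCache>(); }
    if (m.is_recurrent) { return std::make_unique<RecurrentCache>(); }
    if (m.uses_swa)     { return std::make_unique<IswaCache>(); }
    return std::make_unique<UnifiedCache>();
}
```

With this shape, supporting a new architecture that fits one of the standard categories would need no new case in the switch; only a model with unusual cache requirements would add one.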