Static Cache: load models with MQA or GQA #28975
load models with MHA or GQA
0b8888fa
load models with MHA or GQA
6e43bd02
gante
changed the title from "Static Cache: load models with MHA or GQA" to "Static Cache: load models with MQA or GQA" 2 years ago
add test; make fixup
49fab562
gante
marked this pull request as ready for review 2 years ago
better fn name
f0131f88
gante
merged commit 3e70a207 into main 2 years ago
gante
deleted the tiny_llama branch 2 years ago