transformers
Static Cache: load models with MQA or GQA
#28975
Merged

gante merged 4 commits into huggingface:main from gante:tiny_llama
gante
gante load models with MHA or GQA
0b8888fa
gante load models with MHA or GQA
6e43bd02
HuggingFaceDocBuilderDev
gante changed the title from "Static Cache: load models with MHA or GQA" to "Static Cache: load models with MQA or GQA" 2 years ago
gante add test; make fixup
49fab562
gante marked this pull request as ready for review 2 years ago
gante requested a review from ArthurZucker 2 years ago
gante better fn name
f0131f88
ArthurZucker
ArthurZucker approved these changes on 2024-02-13
gante merged 3e70a207 into main 2 years ago
gante deleted the tiny_llama branch 2 years ago