transformers
Offloaded hybrid cache for Llama4 #37401
Merged

ArthurZucker merged 5 commits into main from offloaded-hybrid
Cyrilvallez first try (maybe race condition) (df402374)
Cyrilvallez Update cache_utils.py (ee6674cd)
Cyrilvallez cannot avoid the race condition -> use 2 layers (0a0b495e)
Cyrilvallez Update cache_utils.py (72d63d5e)
Cyrilvallez Update cache_utils.py (f78c5fba)
github-actions marked this pull request as draft 350 days ago
Cyrilvallez marked this pull request as ready for review 350 days ago
github-actions requested a review from ArthurZucker 350 days ago
github-actions requested a review from Rocketknight1 350 days ago
ArthurZucker approved these changes on 2025-04-10
ArthurZucker merged fbb2054e into main 349 days ago
ArthurZucker deleted the offloaded-hybrid branch 349 days ago
