transformers
fbb2054e
- Offloaded hybrid cache for Llama4 (#37401)
248 days ago
Offloaded hybrid cache for Llama4 (#37401)

* first try (maybe race condition)
* Update cache_utils.py
* cannot avoid the race condition -> use 2 layers
* Update cache_utils.py
* Update cache_utils.py
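The commit message notes that offloading with a single on-device buffer could not avoid a race condition, so two device layers are used instead. A minimal sketch of that double-buffering idea, assuming a simplified cache where per-layer KV data lives off-device and two on-device slots alternate so the next layer is prefetched while the current one is consumed (class and method names here are illustrative, not the actual `cache_utils.py` code):

```python
from threading import Thread

class OffloadedCacheSketch:
    """Illustrative double-buffered offloaded cache (not the transformers
    implementation). All per-layer KV entries live in off-device storage;
    two on-device slots alternate (layer i uses slot i % 2) so the next
    layer can be copied in while the current one is still being read.
    With only one slot, the background copy could overwrite the layer
    still in use -- the race the commit message refers to."""

    def __init__(self, num_layers):
        # Off-device storage: one entry per layer (strings stand in for KV tensors).
        self.offloaded = [f"kv-layer-{i}" for i in range(num_layers)]
        self.num_layers = num_layers
        # Two on-device slots for double buffering.
        self.device_slots = [None, None]
        self.prefetch_thread = None

    def _copy_to_device(self, layer_idx):
        # Simulated host->device copy into this layer's slot.
        self.device_slots[layer_idx % 2] = self.offloaded[layer_idx]

    def prefetch(self, layer_idx):
        # Start copying layer_idx onto the device in the background.
        self.prefetch_thread = Thread(target=self._copy_to_device, args=(layer_idx,))
        self.prefetch_thread.start()

    def get(self, layer_idx):
        # Wait for the in-flight prefetch of this layer, then immediately
        # start prefetching the next layer into the *other* slot.
        if self.prefetch_thread is not None:
            self.prefetch_thread.join()
        if layer_idx + 1 < self.num_layers:
            self.prefetch(layer_idx + 1)
        return self.device_slots[layer_idx % 2]


cache = OffloadedCacheSketch(num_layers=4)
cache.prefetch(0)
seen = [cache.get(i) for i in range(4)]
print(seen)  # each layer's cache arrives on-device in order
```

Because layer `i` reads from slot `i % 2` while layer `i + 1` is copied into the other slot, the consumer and the prefetcher never touch the same buffer at the same time.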
References: #37401 - Offloaded hybrid cache for Llama4
Author: Cyrilvallez
Parent: 6d8b0b33