transformers
Offloaded KV Cache
#31325
Merged

n17s
amyeroberts
gante
gante commented on 2024-06-14
n17s
HuggingFaceDocBuilderDev
n17s
gante
gante approved these changes on 2024-06-27
gante requested a review from ArthurZucker 1 year ago
gante
n17s force pushed 1 year ago
n17s
n17s force pushed 1 year ago
n17s
n17s
ArthurZucker
n17s force pushed 1 year ago
ArthurZucker
ArthurZucker commented on 2024-07-22
n17s
ArthurZucker
ArthurZucker commented on 2024-07-23
n17s Initial implementation of OffloadedCache
8e57b081
n17s enable usage via cache_implementation
d0e86661
n17s Address feedback, add tests, remove legacy methods.
2e63564f
n17s Remove flash-attn, discover synchronization bugs, fix bugs
cf31e0ea
n17s force pushed to cf31e0ea 1 year ago
n17s
n17s Prevent usage in CPU only mode
daf8702e
n17s
ArthurZucker
ArthurZucker commented on 2024-07-24
ArthurZucker
n17s Add a section about offloaded KV cache to the docs
47950328
n17s
n17s Fix typos in docs
1a76762f
ArthurZucker
ArthurZucker commented on 2024-07-26
n17s Clarifications and better explanation of streams
667811a6
ArthurZucker
ArthurZucker approved these changes on 2024-08-01
ArthurZucker merged ca59d6f7 into main 1 year ago
ghadiaravi13

