[`Core generation`] Adds support for static KV cache #27931
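For orientation, the feature this PR lands is a pre-allocated ("static") key/value cache whose tensors keep a fixed shape for the whole generation, which is what makes the decoder friendly to `torch.compile`. Below is a minimal usage sketch, assuming a transformers release that exposes `cache_implementation="static"` on the generation config (the checkpoint name is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # illustrative decoder-only checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# Ask generate() for the static cache: key/value tensors are allocated once at
# their maximum size instead of growing every decoding step, so shapes stay fixed.
model.generation_config.cache_implementation = "static"

inputs = tokenizer("Static KV caches", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

The dynamic cache stays the default; nothing changes unless the static implementation is requested explicitly (see the "make sure dynamic is BC" commit below).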
initial commit
17b8b389
lol
80ef8159
nits
2639b5d7
nits nits nits nits nits
9f2e1e4e
ArthurZucker changed the title from [`Core genration`] Adds support for static KV cache to [`Core generation`] Adds support for static KV cache 2 years ago
Merge branch 'main' of github.com:huggingface/transformers into static-cache
271260c4
Merge branch 'main' of github.com:huggingface/transformers into static-cache
5be65ffa
some nits and some testing
c6b6d35c
nits
90224dd5
Wrong implementation but creates good masks in general and is pretty …
24ffbfb4
what seems to work for now
cd95e98f
nits
7cd36555
re-init cache
eeebc664
make it automatic
5819a854
nits and nits
216dd8f5
more nits
a48ae88c
nits
aeefa263
nits
e05f8da1
more nits
07f5cdca
nits
f769b0ea
fastest working cache for now
bb6a1600
also include the attention mask
dd1e42cb
gante commented on 2024-01-10
updates
a3b00030
current state
dacd0fff
working code
021f6744
dummy mask for now
98af8522
Merge branch 'main' of github.com:huggingface/transformers into static-cache
85946706
Merge branch 'static-cache' of github.com:huggingface/transformers into static-cache
60af2937
Merge branch 'main' of github.com:huggingface/transformers into static-cache
05166fe9
a better design
9c1a3b4c
some fix
d5395aff
make outputs match
a20a183e
fastest yet
bce76533
remove chunked qkv
0e59f70f
cleanup
e5730005
some test
fce7e467
goat changes
24ef3cf2
nits
344309f4
dynamic was not working anymore
42e5a383
cache reverts
66377554
small nits
6ec92df2
sdpa
d7849275
Merge branch 'static-cache' of github.com:huggingface/transformers into static-cache
0332d3fb
make sure sdpa passed
4e407036
nit
770c5e64
cleanups
7bd1fca0
gante commented on 2024-01-31
cleanup
25fd440d
nits
4c3220fd
Merge branch 'main' of github.com:huggingface/transformers into static-cache
d51acfa9
pass sdpa
2b2e0c25
make sure dynamic is BC
4b933790
update check on the attn weight
ab07e802
Merge branch 'static-cache' of https://github.com/huggingface/transformers into static-cache
77ccdcec
faster?
ad6832a4
add `_reset_cache`
1cb6a16d
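The `_reset_cache` commit above hints at the lifecycle the buffers need: allocate once, then zero them between generations instead of recreating the tensors. A minimal sketch of that idea, not the PR's actual class (names, shapes, and the update scheme here are assumptions):

```python
import torch

class TinyStaticCache:
    """Illustrative fixed-size KV cache for a single attention layer."""

    def __init__(self, batch, num_heads, max_len, head_dim, dtype=torch.float16, device="cpu"):
        shape = (batch, num_heads, max_len, head_dim)
        # Buffers are allocated once at the maximum length and reused.
        self.key = torch.zeros(shape, dtype=dtype, device=device)
        self.value = torch.zeros(shape, dtype=dtype, device=device)
        self.seen_tokens = 0  # how many positions are currently filled

    def update(self, key_states, value_states):
        # Write the new tokens into the pre-allocated buffers in place,
        # so tensor shapes never change between decoding steps.
        new = key_states.shape[-2]
        pos = slice(self.seen_tokens, self.seen_tokens + new)
        self.key[:, :, pos] = key_states
        self.value[:, :, pos] = value_states
        self.seen_tokens += new
        return self.key, self.value

    def reset(self):
        # Analogue of `_reset_cache`: zero the buffers so the next
        # generation starts from an empty cache without reallocating.
        self.key.zero_()
        self.value.zero_()
        self.seen_tokens = 0
```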
Merge branch 'static-cache' of github.com:huggingface/transformers into static-cache
d0442637
nit
c8383523
Merge branch 'static-cache' of https://github.com/huggingface/transformers into static-cache
e80b6a1e
nit
8308809d
Merge branch 'static-cache' of github.com:huggingface/transformers into static-cache
0132a2c4
merges
87b3064d
Styling
4d88605b
nits
011931ec
revert some BC breaking changes
e838f57b
make all tests pass
c23815a4
torch long not float for attention mask
c9850643
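The dtype fix above concerns the user-facing padding mask, which in transformers is a 1/0 indicator tensor, presumably kept as an integer type so it stays distinct from the additive float masks built internally. A tiny illustration of the expected shape and dtype (sizes are arbitrary):

```python
import torch

# The 2D mask handed to the model marks real tokens with 1 and padding with 0;
# keeping it as torch.long matches what the commit above enforces.
attention_mask = torch.ones((1, 16), dtype=torch.long)  # batch of 1, 16 positions
attention_mask[:, 12:] = 0                               # last 4 positions are padding
```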
try to remove the guard
6a954d59
BC
45760d6f
even more cleanup
64f54553
fix `past_key_value.get_usable_length(kv_seq_len, self.layer_idx)`
f103454a
push a fast version
c7b5d2c0
what actually works
538ccf0a
no contiguous()
ce42624e
push for eager as well
33832d20
simplest and best way to do it yet
8a53f537
merge
f560fe5b
style
5f90ed47
Merge branch 'main' of github.com:huggingface/transformers into static-cache
e5c731e4
fix dtype
b6c91807
fix dtype issues
8de700fe
nits
e92b1a03
nit
d9f7f163
support export to torchscript
d98f2778
Credit helpers
65217dea
nits
a2192366
ArthurZucker marked this pull request as ready for review 1 year ago
handle SDPA edge cases
7a6b57da
handle sdpa quirks
28224231
revert performance break
70df80e6
gante approved these changes on 2024-02-05
Apply suggestions from code review
b4fbf3fc
fix merges
70d5ded5
revert removing ```
ec22fb18
add another test
9968b0e0
update test
dc885ca5
Merge branch 'static-cache' of https://github.com/huggingface/transformers into static-cache
0c2a66fb
use a model that is not protected
e087adc9
only test generation
c0cf2942
update the cache utils to define the position_ids in the cache class
da720c85
fix static cache
8f4c49dc
add subtest to llama tests
c22d564a
update testing suite
89929b9c
nuke whatever we can
d4b24ee5
something wrong with cache
d7e400e3
nit
9d9eec32
latest changes
4eb8a9e0
Merge branch 'main' of https://github.com/huggingface/transformers into static-cache
dad35d62
don't use einsum
6f516a08
nit
f25ac8e0
remove one unused var
17f03509
update test value
b91efbb6
let style be happy
256c324b
make sure cache tests are slow
327b77a3
slow was removed, add it back to test cache utils
8509e913
fix flash_attention_2
60aa86da
very small nit
7de4ace3
revert test change
453df240
make mistral the default copied from
0a1f8d2c
fix copies
040b2f19
nits
1763ec7d
finishup
c4242c8b
fixup
af097af7
Merge branch 'main' of https://github.com/huggingface/transformers into static-cache
5bbde6f4
skip tests
7f8ca33b