transformers
[`Core generation`] Adds support for static KV cache
#27931
Merged

ArthurZucker merged 121 commits into main from static-cache
ArthurZucker
ArthurZucker initial commit
17b8b389
ArthurZucker lol
80ef8159
ArthurZucker nits
2639b5d7
ArthurZucker nits nits nits nits nits
9f2e1e4e
HuggingFaceDocBuilderDev
ArthurZucker changed the title from "[`Core genration`] Adds support for static KV cache" to "[`Core generation`] Adds support for static KV cache" 2 years ago
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into stati…
271260c4
patrickvonplaten
patrickvonplaten commented on 2023-12-26
patrickvonplaten
patrickvonplaten commented on 2023-12-26
oobabooga
ArthurZucker
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into stati…
5be65ffa
ArthurZucker some nits and some testing
c6b6d35c
ArthurZucker nits
90224dd5
ArthurZucker Wrong implementation but creates good masks in general and is pretty …
24ffbfb4
ArthurZucker what seems to work for now
cd95e98f
ArthurZucker nites
7cd36555
ArthurZucker re-init cache
eeebc664
ArthurZucker make it automatic
5819a854
ArthurZucker nits and nits
216dd8f5
ArthurZucker more nits
a48ae88c
ArthurZucker nits
aeefa263
ArthurZucker nits
e05f8da1
ArthurZucker more nits
07f5cdca
patrickvonplaten
ArthurZucker nits
f769b0ea
ArthurZucker fastest working cache for now
bb6a1600
ArthurZucker also include the attention mask
dd1e42cb
gante
gante commented on 2024-01-10
ArthurZucker
ArthurZucker updates
a3b00030
ArthurZucker current state
dacd0fff
ArthurZucker working code
021f6744
ArthurZucker dummy mask for now
98af8522
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into stati…
85946706
ArthurZucker Merge branch 'static-cache' of github.com:huggingface/transformers in…
60af2937
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into stati…
05166fe9
hhllxx1121
ArthurZucker
ArthurZucker a better design
9c1a3b4c
ArthurZucker some fix
d5395aff
ArthurZucker make outputs match
a20a183e
ArthurZucker fastest yet
bce76533
ArthurZucker remove chunck qkv
0e59f70f
ArthurZucker cleanup
e5730005
ArthurZucker some test
fce7e467
ArthurZucker goat changes
24ef3cf2
ArthurZucker nits
344309f4
ArthurZucker dynamic was not working anymore
42e5a383
ArthurZucker
ArthurZucker commented on 2024-01-29
ArthurZucker cache reverts
66377554
ArthurZucker small nits
6ec92df2
ArthurZucker sdpa
d7849275
ArthurZucker Merge branch 'static-cache' of github.com:huggingface/transformers in…
0332d3fb
ArthurZucker make sure sdpa passed
4e407036
ArthurZucker nit
770c5e64
ArthurZucker cleqnups
7bd1fca0
younesbelkada
younesbelkada commented on 2024-01-29
ArthurZucker requested a review from gante 1 year ago
patrickvonplaten
patrickvonplaten commented on 2024-01-30
ydshieh
ydshieh commented on 2024-01-30
ydshieh
ydshieh commented on 2024-01-30
ydshieh
ydshieh commented on 2024-01-30
gante
gante commented on 2024-01-31
ArthurZucker cleanup
25fd440d
ArthurZucker nits
4c3220fd
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into stati…
d51acfa9
ArthurZucker pass sdpa
2b2e0c25
ArthurZucker make sure dynamic is BC
4b933790
ArthurZucker update check on the attn weight
ab07e802
ArthurZucker Merge branch 'static-cache' of https://github.com/huggingface/transfo…
77ccdcec
ArthurZucker faster?
ad6832a4
ArthurZucker add `_reset_cache`
1cb6a16d
ArthurZucker Merge branch 'static-cache' of github.com:huggingface/transformers in…
d0442637
ArthurZucker nit
c8383523
ArthurZucker Merge branch 'static-cache' of https://github.com/huggingface/transfo…
e80b6a1e
ArthurZucker nit
8308809d
ArthurZucker Merge branch 'static-cache' of github.com:huggingface/transformers in…
0132a2c4
ArthurZucker merges
87b3064d
ArthurZucker Styling
4d88605b
ArthurZucker
ArthurZucker commented on 2024-02-01
ArthurZucker nites
011931ec
ArthurZucker revert some BC breaking changes
e838f57b
ArthurZucker make all tests pass
c23815a4
ArthurZucker torch long not float for attention mask
c9850643
ArthurZucker try to remove the guard
6a954d59
ArthurZucker BC
45760d6f
ArthurZucker even more cleanup
64f54553
ArthurZucker fix `past_key_value.get_usable_length(kv_seq_len, self.layer_idx)`
f103454a
ArthurZucker pushh a fast version
c7b5d2c0
ArthurZucker what actually works
538ccf0a
ArthurZucker no contigious()
ce42624e
ArthurZucker push for eager as well
33832d20
ArthurZucker simplest and best way to do it yet
8a53f537
ArthurZucker merge
f560fe5b
ArthurZucker style
5f90ed47
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into stati…
e5c731e4
ArthurZucker dix dtype
b6c91807
ArthurZucker fix dtype issues
8de700fe
ArthurZucker nits
e92b1a03
fxmarty
fxmarty commented on 2024-02-02
ArthurZucker
ArthurZucker nit
d9f7f163
fxmarty
fxmarty commented on 2024-02-02
ArthurZucker support export to torchscript
d98f2778
ArthurZucker Credit helpers
65217dea
ArthurZucker nits
a2192366
fxmarty
fxmarty commented on 2024-02-02
ArthurZucker marked this pull request as ready for review 1 year ago
ArthurZucker handle SDPA edge cases
7a6b57da
ArthurZucker handle sdpa quircks
28224231
ArthurZucker revert performance break
70df80e6
gante
gante approved these changes on 2024-02-05
oobabooga
oobabooga commented on 2024-02-05
ArthurZucker Apply suggestions from code review
b4fbf3fc
ArthurZucker fix merges
70d5ded5
ArthurZucker revert removing ```
ec22fb18
ArthurZucker add another test
9968b0e0
ArthurZucker update test
dc885ca5
ArthurZucker Merge branch 'static-cache' of https://github.com/huggingface/transfo…
0c2a66fb
ArthurZucker use a model that is not protected
e087adc9
ArthurZucker only test generation
c0cf2942
ArthurZucker update the cache utils to define the position_ids in the cache class
da720c85
ArthurZucker fix static cache
8f4c49dc
ArthurZucker add subtest to llama tests
c22d564a
ArthurZucker update testing suite
89929b9c
ArthurZucker nuke whatever we can
d4b24ee5
ArthurZucker smthing wrong with cache
d7e400e3
ArthurZucker nit
9d9eec32
younesbelkada
younesbelkada commented on 2024-02-06
younesbelkada
younesbelkada commented on 2024-02-06
younesbelkada
younesbelkada commented on 2024-02-06
younesbelkada
younesbelkada commented on 2024-02-06
younesbelkada
younesbelkada commented on 2024-02-06
ArthurZucker
ArthurZucker latest changes
4eb8a9e0
ArthurZucker Merge branch 'main' of https://github.com/huggingface/transformers in…
dad35d62
ArthurZucker don't use einsum
6f516a08
ArthurZucker nit
f25ac8e0
ArthurZucker remove one unused var
17f03509
ArthurZucker update test value
b91efbb6
ArthurZucker let style be happy
256c324b
ArthurZucker make sure cache tests are slow
327b77a3
ArthurZucker slow was removed add it back to test cach utils
8509e913
ArthurZucker fix flash_attention_2
60aa86da
ArthurZucker very small nit
7de4ace3
ArthurZucker revert test change
453df240
ArthurZucker make mistral the default copied from
0a1f8d2c
ArthurZucker fix copies
040b2f19
ArthurZucker nits
1763ec7d
ArthurZucker finishup
c4242c8b
ArthurZucker fixup
af097af7
ArthurZucker Merge branch 'main' of https://github.com/huggingface/transformers in…
5bbde6f4
ArthurZucker skip tests
7f8ca33b
ArthurZucker merged 115ac94d into main 1 year ago
ArthurZucker deleted the static-cache branch 1 year ago
ArthurZucker
tsengalb99
amyeroberts
patrickvonplaten
patrickvonplaten commented on 2024-02-09
ZachariahMarrero
tsengalb99
ArthurZucker
chauhang
ArthurZucker
fxmarty
xkszltl
paulcx
gante
ArthurZucker
fxmarty
aliencaocao
paulcx
ArthurZucker
ArthurZucker
paulcx
ArthurZucker
paulcx
ArthurZucker
aliencaocao
ArthurZucker
ArthurZucker
aliencaocao
ArthurZucker
aliencaocao
nxphi47
ArthurZucker
