transformers
🚨All attention refactor🚨 #35235
Merged

Cyrilvallez merged 99 commits into main from all-attention-refactor
ArthurZucker force-pushed to d1aa9ce1 1 year ago
ArthurZucker commented on 2024-12-13
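The opening comment itself is not captured in this log, but commit messages such as `remove all copied from attentions`, `Update sdpa_attention.py`, and `harmonize classes` point at the shape of the change: per-model, copy-pasted attention classes give way to standalone attention functions selected through a shared dispatch table. The following is a minimal sketch of that registry pattern; every name in it is hypothetical and only illustrates the idea, not the actual transformers API.

```python
# Sketch of a registry-based attention dispatch (hypothetical names,
# not the real transformers interface): each backend is a standalone
# function, and models look it up by a config-level string instead of
# duplicating an attention class per backend.
import math


def eager_attention_forward(query, key, value):
    """Plain softmax(QK^T / sqrt(d)) V over small Python lists of lists."""
    d = len(query[0])
    # Scaled dot-product scores, one row of scores per query row.
    scores = [
        [sum(q * k for q, k in zip(qrow, krow)) / math.sqrt(d) for krow in key]
        for qrow in query
    ]
    out = []
    for row in scores:
        # Numerically stable softmax over this query's scores.
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Weighted sum of value rows.
        out.append(
            [sum(w * v[i] for w, v in zip(weights, value)) for i in range(len(value[0]))]
        )
    return out


# Registry: implementation name -> attention callable. New backends
# (sdpa, flash, ...) would register here instead of subclassing.
ATTENTION_FUNCTIONS = {"eager": eager_attention_forward}


def attention_forward(impl, query, key, value):
    """Dispatch to whichever attention backend the config selects."""
    return ATTENTION_FUNCTIONS[impl](query, key, value)
```

With a single key, the softmax weight is 1 and the output is exactly the value row, e.g. `attention_forward("eager", [[1.0, 0.0]], [[1.0, 0.0]], [[2.0, 3.0]])` returns `[[2.0, 3.0]]`. The design point is that adding a backend touches only the registry, not every model file.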
ArthurZucker: refactor LlamaAttention (79cb53c1)
ArthurZucker: minimal changes (4bb485b4)
ArthurZucker: fix llama (f3709070)
ArthurZucker: update (d3ef5397)
ArthurZucker: modular gemmas (45eac582)
ArthurZucker: modular nits (e52af494)
ArthurZucker: modular updates (5ed37aee)
ArthurZucker: nits (38cafc1a)
ArthurZucker: simplify (a862eacd)
ArthurZucker: gpt2 (5639b81a)
ArthurZucker: more modualr and fixes (452d8edc)
ArthurZucker: granite (81a0b664)
ArthurZucker: modular modular modular (bc72c3f5)
ArthurZucker: nits (48caa890)
ArthurZucker: update (df68dd0d)
Cyrilvallez: qwen2 + starcoder2 (0325dc46)
Cyrilvallez: mostly gemma2 (ecd814bd)
Cyrilvallez force-pushed from 8b568230 to ecd814bd 1 year ago
Cyrilvallez: Update image_processing_auto.py (f5fc638d)
Cyrilvallez: fix (5e56d9c0)
Cyrilvallez: Update modular_starcoder2.py (598b7bb5)
Cyrilvallez: fix (0f565fbf)
Cyrilvallez: remove all copied from attentions (c9ac84d4)
ArthurZucker: remove gcv (d189fe74)
ArthurZucker: make fix-copies (9c83d969)
ArthurZucker: oups (138368ec)
ArthurZucker: oups2.0 (7225a4f3)
Cyrilvallez: fix some modulars + all copied from (a3b9195f)
ArthurZucker: should be good now (8d93708e)
Cyrilvallez: Merge branch 'all-attention-refactor' of github.com:huggingface/trans… (3cc2b4df)
Cyrilvallez: Merge branch 'all-attention-refactor' of github.com:huggingface/trans… (074e469d)
Cyrilvallez: revert unwanted changes (54d9b954)
Cyrilvallez: Update modeling_decision_transformer.py (944e26e9)
Cyrilvallez: finish cleanup (911833f8)
Cyrilvallez: Update modeling_olmo.py (ea269109)
Cyrilvallez: consistency (bc421af3)
Cyrilvallez: re-add gradient checkpointing attribute (8664ddcd)
Cyrilvallez: fix (607e928e)
Cyrilvallez: style (46125952)
Cyrilvallez: make config necessary (20c376cb)
Cyrilvallez: bis (0ac9db2b)
Cyrilvallez: bis (349b7ab8)
Cyrilvallez: Update modeling_my_new_model2.py (defa88ff)
Cyrilvallez: is_causal attr (fbf4b552)
Cyrilvallez: fix (9104d0a4)
Cyrilvallez: remove past kv return from decoder layer (0b093400)
Cyrilvallez: fix (46a0df79)
Cyrilvallez: default rope config (aedd88a7)
Cyrilvallez: correctly fix rope config (57e9b49f)
Cyrilvallez: fix bias (fe90ec05)
Cyrilvallez: fix gpt2 attention output (a3f50d0f)
Cyrilvallez: fix test (6a92c706)
Cyrilvallez: fix inits (a28ad195)
Cyrilvallez: fix default sdpa (9bd6c948)
Cyrilvallez: fix default sdpa implementation (fae05e16)
Cyrilvallez: harmonize classes (838d211d)
Cyrilvallez: fix mistral (e0d10f65)
Cyrilvallez: fix sliding window models (b275fdc8)
Cyrilvallez: mixtral (71eb6a2d)
Cyrilvallez: be more explicit (4e257534)
Cyrilvallez: style (1e8712b2)
Cyrilvallez: fix (854537b2)
Cyrilvallez: several fixes (99bddf01)
Cyrilvallez: Update modeling_dbrx.py (2f666b35)
Cyrilvallez: fix test (bafa020d)
Cyrilvallez: olmo + phi (00a98e71)
Cyrilvallez: rotary (8c254112)
Cyrilvallez: syle (4bb2f257)
Cyrilvallez: phi (44ff5e3d)
Cyrilvallez: phi again (95f7b963)
Cyrilvallez: again (7d550361)
Cyrilvallez: kwargs (24ac9ab8)
Cyrilvallez: Update test_modeling_common.py (bd8ede8a)
Cyrilvallez: skip fx tracing tests (0d3d3e39)
Cyrilvallez: Update modeling_utils.py (49135d04)
Cyrilvallez: gemma 2 (f80a2c33)
Cyrilvallez: again (3e461bd1)
Cyrilvallez: Update modeling_recurrent_gemma.py (7a882d55)
Cyrilvallez: gemma2 (78700734)
Cyrilvallez: granite (5b4ebaad)
Cyrilvallez: style (7bdf61c6)
Cyrilvallez: starcoder (7d5b0b53)
Cyrilvallez: Update sdpa_attention.py (70ef2fd9)
Cyrilvallez: switch args (b8429c5e)
Cyrilvallez: Update modeling_mllama.py (533657c9)
Cyrilvallez: fix (fe20d63a)
Cyrilvallez: cache type tests (248a6072)
ArthurZucker commented on 2024-12-17
Cyrilvallez: gpt2 (46460142)
Cyrilvallez: Update test_modeling_common.py (ad16b1bd)
Cyrilvallez: fix (1df6e29b)
Cyrilvallez: consistency (6c01005c)
Cyrilvallez: fix shape with encoder (f651cd0d)
Cyrilvallez: should be the last one (98b7f974)
Cyrilvallez: tests non model (88e2fe56)
ArthurZucker commented on 2024-12-18
ArthurZucker marked this pull request as ready for review 1 year ago
Cyrilvallez: most comments (5a3bdc44)
Cyrilvallez: small oupsi (f3923b6e)
ArthurZucker changed the title from "All attention refactor" to "🚨All attention refactor🚨" 1 year ago
Cyrilvallez: be more explicit in modulars (a6a2ff9e)
Cyrilvallez: more explicit modulars (aeea33bd)
Cyrilvallez: CIs! it works locally (ec3bef3d)
LysandreJik approved these changes on 2024-12-18
Cyrilvallez: add kwargs to _flash_attention_forward (fc74e397)
Cyrilvallez merged 2c47618c into main 1 year ago
Cyrilvallez deleted the all-attention-refactor branch 1 year ago
vasqu commented on 2024-12-18
yuanyao-nv commented on 2025-06-30

Participants: ArthurZucker, Cyrilvallez, LysandreJik, ydshieh, vasqu, SimJeg, poedator, BenjaminBossan, foreverpiano, Rocketknight1, gante, yuanyao-nv
