transformers
🔴[`Attention`] Bert-based Models Attention Refactor
#38301
Merged

ArthurZucker merged 70 commits into main from vas-bert-attn-refactors
vasqu
vasqu clean start to bert refactor
4c7d3dcb
vasqu changed the title from 🔴[`Atttention`] Bert-based Models Attention Refactor to 🔴[`Attention`] Bert-based Models Attention Refactor 306 days ago
HuggingFaceDocBuilderDev
vasqu Merge branch 'main' into vas-bert-attn-refactors
ae0adfe3
vasqu some test fixes
6c1e5f47
vasqu style
d3c1f369
vasqu fix last tests
3076c99f
vasqu be strict on positional embeddings, fixup according tests
6afc75bf
vasqu cache support
1eaca54b
vasqu more cache fixes, new causal API
3e591058
vasqu simplify masks, fix tests for gen
e376e3cf
vasqu flex attn, static cache support, round of fixes
01227640
vasqu
vasqu commented on 2025-06-30
vasqu ?
f46d6a48
vasqu this time
13f5b49f
vasqu style
82633afe
vasqu fix flash attention tests, flex attention requires torch 2.7.x to wor…
41ddb572
vasqu Merge branch 'main' into vas-bert-attn-refactors
775573e0
vasqu
github-actions
vasqu
vasqu commented on 2025-06-30
ArthurZucker
ArthurZucker commented on 2025-06-30
vasqu roberta
6a7357de
vasqu fixup sdpa remains
d1c76901
vasqu Merge branch 'main' into vas-bert-attn-refactors
b82b47e5
vasqu attention split, simplify args and kwargs, better typing
306a5c2a
vasqu fix encoder decoder
38e8de31
vasqu fix test
5120ca6c
vasqu modular roberta
11de15bd
vasqu albert
dd7aeca4
vasqu data2vectext, making it modular tomorrow
ad3ffe55
vasqu modular data2vec text
52d2052b
vasqu tmp disable
baaa3ecc
vasqu xmod + cache position fixes
8fa32ca9
vasqu whoops
786230b4
vasqu electra + markuplm, small fixes
1865eb33
vasqu remove wrong copy
32cd8d2c
github-actions
vasqu xlm_roberta + some embedding fixes
f199dec4
vasqu roberta prelayernorm
cfcb2678
vasqu RemBert: remove copy, maybe doing it later
95210dd5
vasqu Merge branch 'main' into vas-bert-attn-refactors
ca7c9304
ccdv-ai
vasqu
vasqu Merge branch 'main' into vas-bert-attn-refactors
d4fbfd91
vasqu ernie
396c1ec8
vasqu fix roberta offloading
5a05d5a5
vasqu camembert
d9f0a8a3
vasqu copy fixes
367fe5d0
vasqu bert generation + fixes on eager
369ede62
vasqu xlm roberta xl
b11d91e4
vasqu bridgetower (text) + seamlessv2 copy fixes
34f5f3f1
vasqu rocbert + small fixes
e0f1e83d
robertgshaw2-redhat
vasqu
vasqu Merge branch 'main' into vas-bert-attn-refactors
7abb76c5
vasqu whoops
2d703c3c
ArthurZucker
vasqu small round of fixups
ba3d115d
vasqu NOTE: kernels didnt load with an earlier version, some fixup (needs a…
b40a7041
vasqu the end of the tunnel?
30dcc8cb
vasqu fixup nllbmoe + style
d4481a24
vasqu we dont need this anymore
981adfab
vasqu megatron bert is barely used, low prio skip for now
165d738b
vasqu Merge branch 'main' into vas-bert-attn-refactors
1d3aec5d
vasqu Modernize bert (template for others)
2f9f7c7b
vasqu check inputs for all others (if checkmarked)
a9af620c
vasqu fix bridgetower
0254898f
vasqu style
e773a81a
vasqu fix encoder decoder (partially but cause found and fix also, just nee…
fc9fd97c
vasqu proper fix for bert to force intermediate dict outputs
58a06808
vasqu propagate to others
5179154c
vasqu style
3da6c33a
vasqu marked this pull request as ready for review 188 days ago
vasqu Merge branch 'main' into vas-bert-attn-refactors
986aaef4
vasqu xlm roberta xl investigation, its the layernorm...
df5aa367
vasqu mobile bert
b8ee99da
vasqu another day another merge conflict
f0560254
vasqu Merge branch 'main' into vas-bert-attn-refactors
9d1a118e
vasqu revert this, might cause issues with composed models
422aaa3b
ArthurZucker
ArthurZucker approved these changes on 2025-09-18
vasqu review
ff2af474
vasqu style
f1ef07a1
vasqu Merge branch 'main' into vas-bert-attn-refactors
4f7d3ada
ArthurZucker
github-actions
vasqu another merge conflict, this will never end lol
1c3188fb
vasqu
github-actions
vasqu
ArthurZucker merged 155f7e2e into main 186 days ago
ArthurZucker deleted the vas-bert-attn-refactors branch 186 days ago
Cyrilvallez
Cyrilvallez commented on 2025-09-23
BenjaminBossan
