🔴[`Attention`] Bert-based Models Attention Refactor #38301
clean start to bert refactor
4c7d3dcb
vasqu
changed the title from 🔴[`Atttention`] Bert-based Models Attention Refactor to 🔴[`Attention`] Bert-based Models Attention Refactor 306 days ago
Merge branch 'main' into vas-bert-attn-refactors
ae0adfe3
some test fixes
6c1e5f47
style
d3c1f369
fix last tests
3076c99f
be strict on positional embeddings, fixup according tests
6afc75bf
cache support
1eaca54b
more cache fixes, new causal API
3e591058
simplify masks, fix tests for gen
e376e3cf
flex attn, static cache support, round of fixes
01227640
vasqu
commented on 2025-06-30
?
f46d6a48
this time
13f5b49f
style
82633afe
fix flash attention tests, flex attention requires torch 2.7.x to wor…
41ddb572
Merge branch 'main' into vas-bert-attn-refactors
775573e0
vasqu
commented on 2025-06-30
roberta
6a7357de
fixup sdpa remains
d1c76901
Merge branch 'main' into vas-bert-attn-refactors
b82b47e5
attention split, simplify args and kwargs, better typing
306a5c2a
fix encoder decoder
38e8de31
fix test
5120ca6c
modular roberta
11de15bd
albert
dd7aeca4
data2vectext, making it modular tomorrow
ad3ffe55
modular data2vec text
52d2052b
tmp disable
baaa3ecc
xmod + cache position fixes
8fa32ca9
whoops
786230b4
electra + markuplm, small fixes
1865eb33
remove wrong copy
32cd8d2c
xlm_roberta + some embedding fixes
f199dec4
roberta prelayernorm
cfcb2678
RemBert: remove copy, maybe doing it later
95210dd5
Merge branch 'main' into vas-bert-attn-refactors
ca7c9304
Merge branch 'main' into vas-bert-attn-refactors
d4fbfd91
ernie
396c1ec8
fix roberta offloading
5a05d5a5
camembert
d9f0a8a3
copy fixes
367fe5d0
bert generation + fixes on eager
369ede62
xlm roberta xl
b11d91e4
bridgetower (text) + seamlessv2 copy fixes
34f5f3f1
rocbert + small fixes
e0f1e83d
Merge branch 'main' into vas-bert-attn-refactors
7abb76c5
whoops
2d703c3c
small round of fixups
ba3d115d
NOTE: kernels didnt load with an earlier version, some fixup (needs a…
b40a7041
the end of the tunnel?
30dcc8cb
fixup nllbmoe + style
d4481a24
we dont need this anymore
981adfab
megatron bert is barely used, low prio skip for now
165d738b
Merge branch 'main' into vas-bert-attn-refactors
1d3aec5d
Modernize bert (template for others)
2f9f7c7b
check inputs for all others (if checkmarked)
a9af620c
fix bridgetower
0254898f
style
e773a81a
fix encoder decoder (partially but cause found and fix also, just nee…
fc9fd97c
proper fix for bert to force intermediate dict outputs
58a06808
propagate to others
5179154c
style
3da6c33a
vasqu
marked this pull request as ready for review 188 days ago
Merge branch 'main' into vas-bert-attn-refactors
986aaef4
xlm roberta xl investigation, its the layernorm...
df5aa367
mobile bert
b8ee99da
another day another merge conflict
f0560254
Merge branch 'main' into vas-bert-attn-refactors
9d1a118e
revert this, might cause issues with composed models
422aaa3b
review
ff2af474
style
f1ef07a1
Merge branch 'main' into vas-bert-attn-refactors
4f7d3ada
another merge conflict, this will never end lol
1c3188fb
ArthurZucker
deleted the vas-bert-attn-refactors branch 186 days ago
Assignees
No one assigned