transformers
Refactor weight loading
#41580
Merged

ArthurZucker merged 387 commits into main from refactor-weight-loading
ArthurZucker commented on 2025-10-14 (3 comments)
HuggingFaceDocBuilderDev
molbap added Core: Modeling
ArthurZucker added for_v5?
ArthurZucker marked this pull request as ready for review 119 days ago
LysandreJik commented on 2025-10-30 (17 comments)
LysandreJik approved these changes on 2025-10-30
ArthurZucker ah actually we don't discard lm head if missing -> needs to be moved …
4d797099
ArthurZucker fix some tests
d1e84db3
ArthurZucker small fixes
f2938df8
ArthurZucker up
22fcdaf9
ArthurZucker up
7d78aa1b
ArthurZucker dik why we tie weights twice but,..,,.
80517f53
ArthurZucker ups
2ff85326
ArthurZucker removeunused
d923061e
ArthurZucker fix hunyuan
ce8c1c19
ArthurZucker small fix
23e3ed74
ArthurZucker nits
a8fb5540
ArthurZucker ish
ab6ee8ae
ArthurZucker up
77ccbb17
ArthurZucker rev
8a8beff7
ArthurZucker fix more tie weights keys
02386ce7
ArthurZucker small fixes
1c87945a
ArthurZucker nit
00b95ee0
ArthurZucker update
a170f290
ArthurZucker fix and fix
8b924a3b
ArthurZucker fix a test
8f7b1d02
ArthurZucker glubs
93862177
ArthurZucker current shitty changes
4894a257
ArthurZucker force pushed from e47febc8 to 4894a257 115 days ago
ArthurZucker ship validated ones
da7dc100
ArthurZucker more
d7c81717
ArthurZucker more update
e0884089
ArthurZucker more
4f212de4
ArthurZucker more
dc5a22c2
ArthurZucker more
675b2bca
ArthurZucker mllama
f85f2397
ArthurZucker more up
76b6a92d
ArthurZucker fix ernie
ba1a8b64
ArthurZucker fix xopies
ba3de5ad
ArthurZucker up more
8fd255c7
ArthurZucker more fixes
5d7507b1
ArthurZucker up
0fb23403
ArthurZucker up
32b92738
ArthurZucker fix-copies
0b95826c
ArthurZucker fix more
5794d27d
ArthurZucker more updates
5e71bd4a
ArthurZucker AI UPDATE
20d1b340
ArthurZucker up
89846e7d
ArthurZucker hoey
a581fd75
Cyrilvallez make it fast
1652c9c5
Cyrilvallez fix
dcad7030
ArthurZucker lol
c921cede
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
50714d8c
ArthurZucker fix asjusting
8936cc40
ArthurZucker more fixes
5c54332e
ArthurZucker _dtype nit
ff108789
ArthurZucker up
9601b82c
ArthurZucker nit
db02b9d7
ArthurZucker update
42fd4c43
ArthurZucker update
45271710
Cyrilvallez remove semaphores
bd362112
Cyrilvallez fix import to avoid jit execution
e2aefee7
ArthurZucker try to remove custom tiing logic when its stupid
74a0e9c7
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
ead2ac37
ArthurZucker fix more individual models
e7165da0
ArthurZucker fix whisper as well
2ff765e9
ArthurZucker fix?
912562c0
ArthurZucker fox umt5
c43495a5
Cyrilvallez improve tqdm bar
57988f25
Cyrilvallez cleanup a bit
8c16de16
Cyrilvallez oupsi
b8927d67
ArthurZucker some updates
2733ff69
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
8baa3fe9
Cyrilvallez improve
d91701f7
Cyrilvallez Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
5146dec4
Cyrilvallez remove all buffering -> much faster without it
acc5b245
ArthurZucker remove some tie_weights custome funcs when not needed
58389a1f
ArthurZucker more fixes related to strict matching regex
92c0229a
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
d9e7fe65
ArthurZucker remove ALL custom tie weights
b57d7897
BenjaminBossan commented on 2025-11-05
ArthurZucker small update
ef8b6c35
Cyrilvallez revert change to init scheme (no need for params)
a228fd0a
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
07574ddd
Cyrilvallez mixtral init
2526cc5d
kylesayrs commented on 2025-11-05 (3 comments)
ArthurZucker try less strict source check
6cb37940
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
e4cadfb1
Cyrilvallez tied weight first shot to the fiiiixxxxxx
3fea8658
ArthurZucker does this help?
82f94b8a
ArthurZucker :)
84dd6eb2
ArthurZucker fix some ppolry defined tied_weights_keys for now
cc081954
ArthurZucker subclass nn.Parameters
f692f4bd
ArthurZucker force pushed from 28b620d6 to f692f4bd 112 days ago
ArthurZucker up
2fa058fe
ArthurZucker lol
78d46227
ArthurZucker Ouiiii
8ff4ad56
ArthurZucker fix led
32226787
ArthurZucker fix long cat flash
9a76a6ee
ArthurZucker fix qwen and long cat flash
9fde9f78
ArthurZucker properly fix qwen init
074a449f
ArthurZucker just push this for now
dde5500d
ArthurZucker propnet is dumb
0e7d2d05
ArthurZucker update
18b02eea
ArthurZucker push
9c0db728
ArthurZucker remove explict sharing of some tied keys.
75d3afcb
ArthurZucker update decoder.bias
85ab0859
ArthurZucker moe case
443573ae
ArthurZucker more changes to untangle old hardcoded ting
f8f09734
ArthurZucker fixup
5c9d56cb
ArthurZucker Merge branch 'main' into refactor-weight-loading
a0029f20
ArthurZucker fix big faileurs
44943fb8
ArthurZucker fix prophnet
76d66be5
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
d176b489
ArthurZucker fix resize token embeddings
3ffc59ef
ArthurZucker nits
2a00e493
ArthurZucker fix xcodex
f7d0183d
ArthurZucker asyncio?
bbf5b000
ArthurZucker fix smart apply
04128324
ArthurZucker fix data-2-vec
c137ea33
ArthurZucker [build-ci-image]
7b7c9903
ArthurZucker checkout
de74aebb
ArthurZucker uupdate
94a53d4c
ArthurZucker fix hunyuan
8755a4be
ArthurZucker update error message
5be67b96
ArthurZucker fix deformable detr
86a4e516
ArthurZucker fixes
09bcd2ee
ArthurZucker fix init weights for non param gate up projs
7b457fd0
ArthurZucker shared todo?
e033947a
ArthurZucker update some models
f93f3570
ArthurZucker big revert, don't break this behaviour
2f0a6aed
ArthurZucker ty @SunMarc this fixes the buffers
3c8c7572
ArthurZucker mt5 fuck
f5a7c33d
ArthurZucker fix lxmbert
647f720a
ArthurZucker nuke slow test fetcher
bed6ea1c
ArthurZucker fix zamba and deepcopy for now
2ec0a5fb
ArthurZucker fix zamba tied weight keys! ~
f9c7ef87
ArthurZucker fix-copies
8df3ffd8
ArthurZucker update fetch terst
e76481b9
ArthurZucker fix gradient for test modeling common!
de007511
ArthurZucker break "shared" for now I will fix tomorrow changes are properly isoal…
cdd1a9b3
ArthurZucker does this fix marian? probably not
d3f64762
ArthurZucker fix some vlms
0a7db831
ArthurZucker D fine seems to handle this well
18142005
ArthurZucker glob is fine actually
b77825d3
ArthurZucker fix dab detr
5dbb7833
ArthurZucker small steps
9edc81b8
ArthurZucker opusy
970f4e53
ArthurZucker fix some more models?
0361d47d
ArthurZucker yups
dc757737
ArthurZucker better erro
cdb12846
ArthurZucker fix?
de9a2d98
ArthurZucker fix double escape
b9a9f4d8
ArthurZucker escape wehere it makes sense
c944619e
ArthurZucker ??
f9105240
ArthurZucker fix ibert
4aa2ade0
ArthurZucker fix tvp as well
2ef1c2b2
ArthurZucker more fxes
b98a7bce
ArthurZucker try always download ref PR
74e6c871
ArthurZucker ONONONO
5064edd1
ArthurZucker big fixup
3f8a304c
ArthurZucker more fixup
3ecaa63d
ArthurZucker small step
f384524e
ArthurZucker small nits
290337a2
ArthurZucker nits
76b388c9
ArthurZucker brut force some stuff
e69b988e
ArthurZucker fix vilt
c2781f57
ArthurZucker make sure special models that always need tie always tie
f64ee960
ArthurZucker cleaning up
a3e40152
ArthurZucker small nits
9eecbd27
ArthurZucker fix zamba and bridge tower!
b2fa432b
ArthurZucker just fixup
dbbfdf29
vasqu commented on 2025-11-11
ArthurZucker potential culprits
ab4890c8
ArthurZucker revert bark and fix bridgetower
937ebf36
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into refac…
e4f9697f
ArthurZucker remove now non existant tie_weights
17803ce9
ArthurZucker ?
9f6838a2
ArthurZucker lol reformer actually had nothing tied!
1afb3eb5
ArthurZucker wow these two fucking models were really not well made
f01a149a
ArthurZucker fix sam family!
0b369802
ArthurZucker fix bark revision
d740c82b
ArthurZucker fix speech2test ?
6f3940ee
ArthurZucker push this for now....
b2f6f61a
ArthurZucker upsy
ade8dab4
ArthurZucker the fuck
f956ccfb
ArthurZucker fix rtdetr
99c6fd49
ArthurZucker update
1ffcfc3f
ArthurZucker proper
ee62aec5
ArthurZucker wow that one 's annoying
6ec80f86
ArthurZucker update
b05e3290
ArthurZucker try to find the culprit
2606596f
ArthurZucker get some help on common
d9e8a09d
ArthurZucker nit about general init and cls.padding_idx
581665ae
ArthurZucker revert num workers update
c43bc687
Cyrilvallez remove old loading func
b6fe4158
ArthurZucker fix glob
4bb8e5c9
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
7d52b063
Cyrilvallez add annotations
455bcc7c
Cyrilvallez Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
fc884c03
ArthurZucker fix re
2e0ed5d2
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
3ddd1cca
Cyrilvallez small improvements
1f86a104
Cyrilvallez fix conflict
4d56fbf1
Cyrilvallez clean some stuff
67a8eebb
Cyrilvallez improvements
e9168ff5
ArthurZucker someone did not understannnnnnd what I tried to dooo or does BNB not …
feda22d9
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
70841c9f
ArthurZucker gluos
52248ba3
ArthurZucker fix case when `.` is just not there
e8dd4a45
Cyrilvallez remove unused arg
1c67fc49
SunMarc recover orignal parameter/buffer using _original
e20ed001
ArthurZucker fix glob issu
827c42a2
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
e5e4d28e
ArthurZucker this?
4db2aa60
Cyrilvallez deepspeed best-effort
2b16c177
Cyrilvallez remove unused stuff
c411ddb2
ArthurZucker Update tie weight keys as they were just wroong
56d368b1
ArthurZucker up
85d0ac1e
ArthurZucker Merge branch 'refactor-weight-loading' of github.com:huggingface/tran…
daa642c1
ArthurZucker augustuc clauss, a gloubs gloups gloubs
bbf71b92
ArthurZucker fixup
127e4d56
ArthurZucker fixup
79541859
ArthurZucker there was fucking typo
f7cd4b3f
ArthurZucker mrain
f9e747e7
ArthurZucker nits
57bf5b28
ArthurZucker fix marian 3 remaining tests
c38ad244
ArthurZucker one more
d7be7df6
ArthurZucker fix some of the copies, not all :)
729e3df6
ArthurZucker small cleanup
c95a3f16
ArthurZucker one propertest
87788403
ArthurZucker fix core model loadig tes
1181e3f7
ArthurZucker attempt a new test
b750e6b9
ArthurZucker fix some of the annoying tests by supporting reading .bin sometimes
3178c3f0
ArthurZucker push
d6ab2505
ArthurZucker push more small fixes
0695197d
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into refac…
fd5a75a3
ArthurZucker remove 1 useless test
f54b5286
ArthurZucker up
1abf6a9b
ArthurZucker fix audio flamingo post rebase
30142909
ArthurZucker fixup
1f1bea3c
ArthurZucker some small updatess
c2dbca0e
ArthurZucker fix sam models
347b966a
ArthurZucker nits
40ed6364
ArthurZucker up
3b2f9342
ArthurZucker updates
fb0fb895
ArthurZucker onem ore
92e27714
ArthurZucker skip this stupid test
06f2ba9a
ArthurZucker some other fixes
3d5c86c7
ArthurZucker fixup
15bc48e8
ArthurZucker update
47743f86
ArthurZucker skip more offloaded stuff
d77cf579
ArthurZucker oups
75f2bd44
ArthurZucker ups
08ad69b5
ArthurZucker update mixtral
b605e1a3
ArthurZucker skip this one
91d40b87
ArthurZucker LET"SGO
638bbfca
github-actions
ArthurZucker fixup
7daacb43
ArthurZucker rope delta order
22c19a72
ArthurZucker fix csm
6d89354e
ArthurZucker small nit
9ccb6935
ArthurZucker merged 6f6095e0 into main 105 days ago
ArthurZucker deleted the refactor-weight-loading branch 105 days ago
xenova
ArthurZucker
MekkCyber
fxmarty-amd commented on 2025-11-17 (2 comments)
fxmarty-amd
fxmarty-amd
ArthurZucker
rkazants
ArthurZucker
IlyasMoutawwakil
ArthurZucker
3outeille
IlyasMoutawwakil
