Preventing initialization of siglip's lecun_normal_, default_flax_embed_init in ZeRO3 #43574
Prevent redundant initialization in lecun_normal_ and default_flax_em…
64a4a5dd
jp1924
changed the title from "preventing duplicate initialization in siglip's `lecun_normal_`, `default_flax_embed_init`" to "Preventing initialization of siglip's lecun_normal_, default_flax_embed_init in ZeRO3" 45 days ago
apply style
05a1ec40
Merge branch 'main' into skip_siglip_init
8cc36421
fix check_repository_consistency
d4085f44
vasqu
commented
on 2026-01-29
Merge branch 'main' into skip_siglip_init
027d51c6
lecun_normal_ & default_flax_embed init > initialization.py
a16f544c
vasqu
approved these changes
on 2026-01-30
Update src/transformers/initialization.py
3db324e9
Merge branch 'main' into skip_siglip_init
9d21007b
Merge branch 'main' into skip_siglip_init
a5786045
Rename `_variance_scaling_` to `_variance_scaling` for consistency an…
98fb262e
Refactor initialization calls in `Phi4MultimodalVisionPreTrainedModel…
3f2ab1ef
Fix initialization calls in `Phi4MultimodalVisionPreTrainedModel`: up…
612d26a2
vasqu
approved these changes
on 2026-02-02
Add test for SigLIP model initialization with DeepSpeed ZeRO-3
04d6e5b6
Merge branch 'main' into skip_siglip_init
a6b6c755
vasqu
commented
on 2026-02-03
Update tests/deepspeed/test_deepspeed.py
7ecf4ee5
Update tests/deepspeed/test_deepspeed.py
96c253b2
Merge branch 'main' into skip_siglip_init
ba4dbcbb
Apply suggestion from @vasqu
ac1c57f0
fix: update embedding initialization function to use the correct suffix
2415b6dd
add test for variance scaling initialization with DeepSpeed ZeRO-3 in…
db68042c
small nits
68699e58
vasqu
approved these changes
on 2026-02-04
vasqu
enabled auto-merge (squash) 38 days ago
vasqu
merged
225254c1
into main 38 days ago
jp1924
deleted the skip_siglip_init branch 23 days ago