transformers
Preventing initialization of siglip's lecun_normal_, default_flax_embed_init in ZeRO3
#43574
Merged

vasqu merged 21 commits into huggingface:main from jp1924:skip_siglip_init
jp1924
jp1924 Prevent redundant initialization in lecun_normal_ and default_flax_em…
64a4a5dd
jp1924 changed the title from "preventing duplicate initialization in siglip's `lecun_normal_`, `default_flax_embed_init`" to "Preventing initialization of siglip's lecun_normal_, default_flax_embed_init in ZeRO3" 45 days ago
jp1924 apply style
05a1ec40
jp1924 Merge branch 'main' into skip_siglip_init
8cc36421
jp1924 fix check_repository_consistency
d4085f44
vasqu commented on 2026-01-29
jp1924 Merge branch 'main' into skip_siglip_init
027d51c6
jp1924 lecun_normal_ & default_flax_embed init > initialization.py
a16f544c
vasqu approved these changes on 2026-01-30
vasqu
jp1924 Update src/transformers/initialization.py
3db324e9
jp1924
jp1924 Merge branch 'main' into skip_siglip_init
9d21007b
jp1924 Merge branch 'main' into skip_siglip_init
a5786045
jp1924 Rename `_variance_scaling_` to `_variance_scaling` for consistency an…
98fb262e
jp1924 Refactor initialization calls in `Phi4MultimodalVisionPreTrainedModel…
3f2ab1ef
jp1924 Fix initialization calls in `Phi4MultimodalVisionPreTrainedModel`: up…
612d26a2
vasqu approved these changes on 2026-02-02
jp1924 Add test for SigLIP model initialization with DeepSpeed ZeRO-3
04d6e5b6
jp1924 Merge branch 'main' into skip_siglip_init
a6b6c755
jp1924
jp1924 requested a review from vasqu 40 days ago
vasqu commented on 2026-02-03
jp1924 Update tests/deepspeed/test_deepspeed.py
7ecf4ee5
jp1924 Update tests/deepspeed/test_deepspeed.py
96c253b2
jp1924 Merge branch 'main' into skip_siglip_init
ba4dbcbb
jp1924 Apply suggestion from @vasqu
ac1c57f0
jp1924 fix: update embedding initialization function to use the correct suffix
2415b6dd
jp1924 add test for variance scaling initialization with DeepSpeed ZeRO-3 in…
db68042c
jp1924
jp1924 requested a review from vasqu 39 days ago
vasqu small nits
68699e58
vasqu approved these changes on 2026-02-04
vasqu enabled auto-merge (squash) 38 days ago
github-actions
vasqu merged 225254c1 into main 38 days ago
HuggingFaceDocBuilderDev
jp1924
jp1924 deleted the skip_siglip_init branch 23 days ago
