Add llama4 #37307

ArthurZucker merged 254 commits into main from add-llama4
ArthurZucker remove one of the last deps
9a75c63f
yonigozlan update fast image processor after refactor
e3c52a2f
ArthurZucker styling
1854fc90
ArthurZucker more quality of life improvements
660dc8c7
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
2defa9c7
ArthurZucker nit
0cf2e771
ArthurZucker update
693fc474
ArthurZucker cleanups
8da4b6e8
ArthurZucker some cleanups
ba7a8aad
ArthurZucker vllm updates
db2821e6
ArthurZucker update fake image token
6c04e10c
pcuenca [convert] Fix typo
5e9d84f3
pcuenca [convert] Strip extraneous bytes from shards
aa595de6
pcuenca [convert] Minor fixes
507857d7
pcuenca [convert] Use num_experts
d9e3f86a
molbap multi-image fixes in modeling + processor
5bebf978
molbap fixup size
671c37bd
pcuenca 128 experts
972c465e
pcuenca Use default rope
1be3ddc3
molbap Merge branch 'final-version' into fixes_cleanups
347a7620
pcuenca Unfuse mlp
b06a26b7
molbap simplify a lot inputs embeds merging
52787d5c
molbap Merge branch 'fixes_cleanups' of github.com:huggingface/new-model-add…
9c0ef18c
molbap remove .item() :eyes:
03e99395
molbap fix from review
ddf7adc2
molbap Merge pull request #5 from huggingface/fixes_cleanups
82004d95
pcuenca Merge branch 'final-version' into moe-128
ca0cd0ea
pcuenca Address feedback
54be1a01
pcuenca Use None "default" for rope_scaling. Add eot.
b38318d1
ArthurZucker set seed
ed00fb30
ArthurZucker Merge branch 'main' of github.com:huggingface/new-model-addition-meta…
b5373e20
youngkent return aspect ratios and bug fixes
fb748af7
liuzijing2014 Moe 128 rebased (#8)
189a1032
liuzijing2014 un-comment write_tokenizer from converting script
24d4599a
liuzijing2014 remove un-used imports
73520342
jmswen [llama4] Pop aspect_ratios from image processor output in Llama4Proce…
ca64ae50
jmswen Merge pull request #11 from huggingface/remove-aspect-ratios
3bf26c26
pcuenca Merge remote-tracking branch 'origin/final-version' into moe-128
4a1fec8d
pcuenca Fix parameter_count name
4af4c778
ArthurZucker Update src/transformers/models/llama4/configuration_llama4.py
b077bb5b
ArthurZucker Merge pull request #4 from huggingface/moe-128
c487c62d
ArthurZucker nit
55a17c58
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
90d58762
ArthurZucker Add changes for no_rope, moe_layers, chunked attention. Just need to …
e53363d1
ArthurZucker Update src/transformers/models/llama4/image_processing_llama4_fast.py
5b8dd838
ArthurZucker Merge pull request #13 from huggingface/meta_vllm
87abef5a
ArthurZucker nit
71385f16
ArthurZucker Merge branch 'main' of github.com:huggingface/new-model-addition-meta…
0c3f25a5
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
ec85fa38
ArthurZucker fix post merge with main
c358a1b4
ArthurZucker support flex attention
0c3dc0c0
ArthurZucker Merge branch 'final-version' into norope
1f4072b3
ArthurZucker fixes
d728d06f
MekkCyber fix
31d88f17
MekkCyber add layer
c338736b
ArthurZucker small updates
6529cade
MekkCyber rebase and delete llm_compressor
558c096d
MekkCyber nit
72517161
jmswen [llama4/mm] Add back <|image|> token that delimits global tile
5be1b28a
jmswen Merge pull request #16 from huggingface/global-tile
6f63da62
jmswen [llama4/mm] Fix Llama 4 image processing unit tests
f4f9fbce
jmswen add explicit dtype
2ad69a48
ArthurZucker sdpa works
0a9da1b5
jmswen Merge pull request #17 from huggingface/tests
21eb873c
MekkCyber Merge pull request #15 from huggingface/fix_quantization
4047e865
ArthurZucker comment todo small
6da9409f
liuzijing2014 fix model loading
233c7df8
ArthurZucker Merge pull request #18 from huggingface/meta/fix-model-loading
cd4a2dae
MekkCyber revert
fa75c349
ArthurZucker nits
9679739d
ArthurZucker Merge pull request #19 from huggingface/reverting_quantization_fix
eb677fa9
molbap small fix for TP on 1 node
b61c859c
pcuenca Read new params from config
822f2961
pcuenca Add <|eom|>
a417896d
pcuenca lol don't know how this got here
37391a3e
MekkCyber adding fp8
fe240a6a
pcuenca Save processor, fix chat template
ef31789f
pcuenca style
afcc7ec3
pcuenca Add boi/eoi tokens
ce5d1ea0
ArthurZucker fixes for now flex seems to work :)
da1e6910
ArthurZucker updates
7a2afb3d
ArthurZucker nits
85cf8b92
ArthurZucker updates
ab268fb8
MekkCyber missing keys
f418d062
ArthurZucker add context parallel
2133277b
ArthurZucker update
c29469ce
MekkCyber update
8b0a8c9f
ArthurZucker fix
e472a4ee
ArthurZucker nits
2f8d05bd
mht-sharma add worldsize and make eager attn work for vision
196d87ed
ArthurZucker Merge pull request #23 from huggingface/minor_tgi_fix
ef479fa1
pcuenca Ignore new key present in base models
12451706
MekkCyber add tp_plan
ddf89936
liuzijing2014 fix nope
b98cde83
liuzijing2014 minor fix
b25084be
ArthurZucker Merge pull request #26 from huggingface/meta/fix-nope
0f5b27ba
sarckk Clean up Llama4 vision model
99ec54bf
ArthurZucker Merge pull request #28 from huggingface/cleanup-mllama4
0a102524
ArthurZucker current updates
90e8e2c8
ArthurZucker add support for `attn_temperature_tuning`
5e87ba9c
ArthurZucker add floor scale
9e2e0f95
ArthurZucker add missing attn scales
5b1721bb
ArthurZucker push what works, dirty trick for the device synch
c06da80c
ArthurZucker oups
29f55d2b
pcuenca Fix pad_token_id
cf83f0b7
SunMarc fix causallml loading
06413dcd
SunMarc rm
ed6cba87
ArthurZucker Merge pull request #20 from huggingface/conversion-fixes
6d564d03
SunMarc fix tied-weights
ff1df035
molbap fix sdpa
6decf844
molbap Merge branch 'norope' of github.com:huggingface/new-model-addition-me…
ba2e4641
ArthurZucker Merge pull request #32 from huggingface/remove-warning
4eabf8f2
ArthurZucker push current version
7a001691
ArthurZucker Merge branch 'norope' of github.com:huggingface/new-model-addition-me…
a820dbe5
ArthurZucker should work with both short and long
24dbcad6
MekkCyber add compressed_tensors & fix fbgemm tp
f2bbb4ba
drisspg Fix flex impl
aeaad13a
ArthurZucker style
96066e09
Cyrilvallez chunking
eb535ee0
ArthurZucker Merge branch 'final-version' into norope
60a58cb7
ArthurZucker try to revert the potentially breaking change
e19af4b3
ArthurZucker fix auto factory
eb167f28
Cyrilvallez fix shapes in general
7f8941d2
SunMarc rm processing
30cacf70
ArthurZucker Merge pull request #30 from huggingface/fix-causal-lm-loading
99f2297e
ArthurZucker commit cache utils cleanup
7990c78f
pcuenca Fix context length
c7d4c883
MekkCyber fix
efb45772
MekkCyber Merge branch 'final-version' into add_fbgemm
9f9974b1
MekkCyber allocate
174eda3c
MekkCyber update tp_plan
bdfb5731
MekkCyber Merge pull request #21 from huggingface/add_fbgemm
aa8daba2
ArthurZucker fix SDPA!
05cc59e1
danieldk Add support for sparse `Llama4TextMoe` layer from the kernel hub
dcb29eb8
ArthurZucker cleanup
61626d0d
ArthurZucker better merge
373a472e
ArthurZucker Merge branch 'norope' of github.com:huggingface/new-model-addition-me…
d7d09a17
ArthurZucker update
64c2133d
ArthurZucker still broken fixing now
85b3c7ac
ArthurZucker nits
bfc8049c
ArthurZucker revert print
5da08327
pcuenca Write max_position_embeddings and max_model_length
bc44b2be
Cyrilvallez Update modeling_llama4.py
1a762675
pcuenca Save attention_chunk_size
fd0f2733
pcuenca Sync eos terminators
f03660ad
pcuenca Read initializer_range
3612b9cc
pcuenca style
f7818858
pcuenca remove `dict`
206c8aea
ArthurZucker fix
51f7cd24
ArthurZucker eager should use `chunked_attention_mask`
cb58ceac
ArthurZucker revert
74142354
ArthurZucker fixup
04b302a7
ArthurZucker Merge pull request #14 from huggingface/norope
a515579a
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
a9045fc9
ArthurZucker Merge pull request #36 from huggingface/sparse-llama4-moe
ccda19f0
ArthurZucker Merge branch 'final-version' into fix-context-length
598dded8
ArthurZucker Merge pull request #35 from huggingface/fix-context-length
ec7656a4
SunMarc fix config
fcee23da
LysandreJik Revert "Merge pull request #36 from huggingface/sparse-llama4-moe"
6ca6f66c
Cyrilvallez Fix typo and remove warning with compiled flex and chunked prefill
535030a0
pcuenca Fix MoE vs FF (#41)
a43e0561
MekkCyber fix
f5dd6fb7
sarckk Use correct no_rope_layers if provided one is empty list
7c03c7e0
yeqcharlotte Merge pull request #46 from huggingface/keep-nrope-layers-fix
6a8b9f62
MekkCyber update tests
7bda11f2
MekkCyber fix
e547b10b
MekkCyber skipping some tests
0130b2df
liuzijing2014 fix fp8 loading
93022de7
liuzijing2014 fix text generation pipeline
45cf5828
ArthurZucker eager needs 4D mask
a3e8267d
SunMarc fix
6ab06825
ArthurZucker Merge pull request #50 from huggingface/fix-eager
fd150bb7
LysandreJik Some cleanup
ef8dbe2b
LysandreJik fix
c38bf3a8
MekkCyber update
141da657
MekkCyber fix
66c36a47
SunMarc replace correctly module
9b2e35df
MekkCyber patch
ce91d95e
MekkCyber modulelist
2374ff71
MekkCyber update
61f45af6
MekkCyber update
a471b104
MekkCyber clean up
4c4bc81c
pcuenca Don't move to `cuda:0` in distributed mode
f642d32c
SunMarc restrict to compressed tensors for now
3d58f8e1
SunMarc rm print
8dbf7cb9
LysandreJik Docs!
48b4f563
LysandreJik Fixes
46b08156
LysandreJik Update docs/source/en/model_doc/llama4.md
0849d322
LysandreJik Fixes
f7756b4e
mht-sharma cuda graph fix
27364daf
ArthurZucker Merge pull request #38 from huggingface/smol-fix
b239675a
ArthurZucker Merge pull request #49 from huggingface/fix-quantization
aeec2dce
LysandreJik Merge pull request #53 from huggingface/l4-docs
8578252f
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
eb9e4afb
ArthurZucker revert some stuff
ad839d3c
ArthurZucker fixup
9f03f059
ArthurZucker styling
83282a19
ArthurZucker Merge pull request #44 from huggingface/fix_style
fb495fd9
ArthurZucker Merge pull request #54 from huggingface/fix-tp-pipeline
29028393
mht-sharma Update src/transformers/models/llama4/modeling_llama4.py
3eab4436
ArthurZucker Merge branch 'final-version' into code-quality
688dc5cf
ArthurZucker fixup
695c1e7f
ArthurZucker Merge branch 'code-quality' of github.com:huggingface/new-model-addit…
54785ef2
ArthurZucker commit licence, cleanup here and there and style
26b56748
ArthurZucker more styling changes
c53e2595
ArthurZucker Merge pull request #51 from huggingface/code-quality
f87c2378
ArthurZucker Merge pull request #55 from huggingface/tgi_cuda_graph_fix
7d5d5f0d
ArthurZucker fix dummies
1895d02c
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
931dad92
ArthurZucker fix and clean docstrings
ed669a34
ArthurZucker remove comment
7f292e1f
ArthurZucker Merge branch 'main' of github.com:huggingface/new-model-addition-meta…
b97451ea
ArthurZucker remove warning
34f6e9ef
LysandreJik Only fast image processor is supported
bac11b51
LysandreJik nit
d73aea8c
ydshieh trigger CI
ab8bbadc
ArthurZucker fix issue with flex encoder
6c6e9014
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
4994729f
ArthurZucker Merge pull request #58 from huggingface/only-fast-image-processor
5b96e5d2
ArthurZucker fix dynamic cache
5ce5746b
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
555c4eeb
LysandreJik Code quality
6ba8ef7f
LysandreJik Code quality
ecaa1a7b
ArthurZucker fix more tests for now
0c8624b2
LysandreJik Code quality
8167ac4c
LysandreJik Code quality
71521afb
ArthurZucker Nuke bunch of failing stuff
949b1b7e
ArthurZucker Merge branch 'final-version' of github.com:huggingface/new-model-addi…
b8786474
LysandreJik Code quality
cbb6e599
LysandreJik Code quality
8c509348
ArthurZucker cleanup removal of slow image processor
44a90c0f
ArthurZucker ruff fix fast image processor
99b6bc8f
LysandreJik fix
7c471ea7
ArthurZucker fix styling
538ba2b0
ArthurZucker git push Merge branch 'final-version' of github.com:huggingface/new-m…
50a8daab
github-actions marked this pull request as draft 1 year ago
ArthurZucker marked this pull request as ready for review 1 year ago
LysandreJik approved these changes on 2025-04-05
ArthurZucker added New model
ArthurZucker added Tensor Parallel
LysandreJik Docs
07eaf8cc
LysandreJik Repo consistency
8b39d94f
LysandreJik Repo consistency
3736b900
ArthurZucker fix sliding window issue
92746533
ArthurZucker git push Merge branch 'add-llama4' of github.com:huggingface/transfor…
22a33e3f
ArthurZucker separate llama cache
748d6221
ArthurZucker styling
6a777c0b
LysandreJik Repo consistency
457f3c6a
LysandreJik Repo consistency
1226014c
ArthurZucker push what works
ac54e8ff
ArthurZucker Merge branch 'add-llama4' of github.com:huggingface/transformers into…
69e94706
LysandreJik L4 Repo consistency
8f08b701
LysandreJik Docs
e9769f02
ArthurZucker fix last last alst alst alst alstsaltlsltlaslt
2ec5fbe4
ArthurZucker Merge branch 'add-llama4' of github.com:huggingface/transformers into…
9bfae248
ArthurZucker merged 25b7f272 into main 1 year ago
ArthurZucker deleted the add-llama4 branch 1 year ago
yeqcharlotte
HuggingFaceDocBuilderDev
kadirnar
ArthurZucker
nivibilla
ArthurZucker
ArthurZucker
ddh0
ArthurZucker
ArthurZucker
radoslav-dimitrov-indeavr
YenFuLin
YenFuLin commented on 2025-05-14
quantLm14
quantLm14 commented on 2025-06-02
