transformers
Add GPT OSS model from OpenAI
#39923
Merged

ArthurZucker merged 421 commits into main from add-oai
qgallouedec fix
546efee6
qgallouedec nice
7b077aee
ArthurZucker where i am at
ebcae9a9
ArthurZucker Bro this works
528b3c85
qgallouedec Merge pull request #16 from huggingface/fix-attention
9c61a8cd
ArthurZucker Update src/transformers/integrations/tensor_parallel.py
297e47e2
ArthurZucker Merge pull request #11 from huggingface/tp_embed_parallel
2f852e27
ArthurZucker cleanups
3d25cf75
ArthurZucker Merge branch 'add-oai' into add-fast-flash-kernel
ff0544bf
ArthurZucker yups that was breaking
29454d2e
ArthurZucker Merge branch 'add-fast-flash-kernel' of github.com:huggingface/new-mo…
b3582fc8
ArthurZucker Merge pull request #15 from huggingface/add-fast-flash-kernel
f33a74d6
ArthurZucker Merge branch 'main' of github.com:huggingface/new-model-addition-open…
1f3ae2b3
ArthurZucker Update src/transformers/models/openai_moe/modeling_openai_moe.py
15c85e0e
ArthurZucker merge
0c7379ae
SunMarc gather on experts and not mlp
ad0fc38f
edbeeching add changes for latest convert branch
4fb73451
edbeeching adds options to get output_router_logits from config
968238ca
Vaibhavs10 bring chat template + special tokens back into the script.
4bc55572
ArthurZucker Merge pull request #22 from huggingface/vb/special-tok
68fd8339
ArthurZucker Merge pull request #21 from huggingface/ed-fix-modeling
410435a2
MekkCyber initial commit
07bd34d4
MekkCyber update
b7987d2e
MekkCyber working with shards
2c0fd4d3
MekkCyber add model.safetensors.index.json
1d03f3ac
MekkCyber fix
40e379d1
MekkCyber fix
b68aa6b4
MekkCyber mxfp4 flag
a87db4f4
MekkCyber rm print
c3c01f07
qgallouedec Fix PAD/EOS/BOS (#18)
863630d9
MekkCyber add some doc
eab251f7
SunMarc Merge pull request #23 from huggingface/update_conversion_script
928b9b6c
Vaibhavs10 special tokens based on harmony.
9280e590
Vaibhavs10 add in tokenizer config as well.
b382c5e0
Vaibhavs10 Merge pull request #25 from huggingface/vb/upd-conversion-script
7cdd0be9
ArthurZucker prepare for rebase with main
f8f3e40a
ArthurZucker Merge branches 'add-oai' and 'add-oai' of github.com:huggingface/new-…
c9dc8f29
ArthurZucker merge with main
0ce752c6
edbeeching Fix for initialize_tensor_parallelism now returning 4-tuple
60af8419
SunMarc mxfp4
1ce172b4
SunMarc mxfp4 draft
c0bee222
SunMarc fix
fe896d36
SunMarc fix import
174147df
SunMarc draft
b8215ddd
SunMarc draft impl
62f77e17
SunMarc finally working !
6e9d0c72
SunMarc simplify
6b8b279f
SunMarc add import
ea5c364a
SunMarc working version
1175ab46
SunMarc consider blocks and scales
d53cb49e
SunMarc device mesh fix
8c43631f
MekkCyber initial commit
4f515ebc
MekkCyber add working dequant + quant logic
0ff67272
MekkCyber update
13cb07b0
MekkCyber non nan, gibberish output
39888563
MekkCyber working EP + quantization finally !
b9c8138b
MekkCyber start cleaning
5117d71e
MekkCyber remove reversing process
3733a349
MekkCyber style
65873596
MekkCyber some cleaning
79610731
MekkCyber initial commit
0de006a2
MekkCyber more cleaning
12a9e802
MekkCyber more cleaning
39047834
MekkCyber simplify
75e0f21a
MekkCyber more cleaning
c8ce0473
MekkCyber rm duplicated function
8b162f70
MekkCyber changing tp_plan
8a00f600
MekkCyber update tp plan check
d760f30c
MekkCyber add loading attribute
b34570e7
MekkCyber dequantizing logic
a4950aa6
MekkCyber use subfunctions
89b06710
MekkCyber import cleaning
7bfdca61
MekkCyber update_param_name
21872bd0
edbeeching adds clamped swiglu
b68ece87
edbeeching add clamping to training path
3e106d62
MekkCyber simplify dequant logic
1716e6d8
ArthurZucker Merge branch 'main' of github.com:huggingface/new-model-addition-open…
f49bcbb9
ArthurZucker update
b8b00238
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
6400fb29
ArthurZucker Bad merge
69761698
MekkCyber more simplifications & tests
195cca63
ArthurZucker fix !
345afb13
ArthurZucker Merge pull request #26 from huggingface/add-clamp-swiglu
7b18304a
ArthurZucker fix registering custom attention
009355a6
MekkCyber fix order
d237a90c
MekkCyber fixes
ccffc0b9
MekkCyber some test nits
f92878af
MekkCyber nits
90522c41
MekkCyber nit
dbb8b20a
MekkCyber Merge branch 'add-oai' into adding_packing_format_option
d5634bda
MekkCyber Merge pull request #20 from huggingface/adding_packing_format_option
587d8dae
MekkCyber fix
edd92321
SunMarc Merge pull request #27 from huggingface/guard_kernels_imports
c0ef1563
lewtun Clamp sink logits
dc2b16fe
lewtun Clean
b0508307
lewtun Soft-max trick
e0e406ec
lewtun Clean up
54e88254
lewtun p
0378ae86
ArthurZucker Merge pull request #28 from huggingface/fix-train-bsz
a2089800
MekkCyber fix deepspeed
077cfeef
ArthurZucker update both modeling and modular for cleanup
bec11b79
MekkCyber contiguous
7d8ac2ed
ArthurZucker update tests
42ab1088
ArthurZucker fix top_k router call
e9f130a5
ArthurZucker revert renaming
da77d5e3
ArthurZucker test nits
5b0bd402
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
9af87b2b
ArthurZucker small fixes for EP
b43d2cd4
ArthurZucker fix path for our local tests
13ec4ef3
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
0b5a0e97
ArthurZucker update as I should not have broken that!
0276225a
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
f1cf9519
ArthurZucker fix the loss of mixtral
a34b39ca
ArthurZucker revert part of the changes related to router_scores, kernel probably …
e7cc5914
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
b7a9e4aa
ArthurZucker deleting a small nit
f1245b4c
ArthurZucker Merge branches 'add-oai' and 'add-oai' of github.com:huggingface/new-…
8a6fbf9b
SunMarc update arch
9b387ca9
MekkCyber fix post processing
6c0effa9
SunMarc update
ab0f9295
Vaibhavs10 Merge pull request #30 from huggingface/fix-conversion-architecture
e030193d
SunMarc running version but not expected output
c80bd448
SunMarc Merge pull request #29 from huggingface/fix_ds
6c55b12a
SunMarc Merge remote-tracking branch 'origin/add-oai' into update-triton-kernels
740f3aa3
MekkCyber moving to cuda
dc125183
MekkCyber initial commit
20dfa56d
MekkCyber revert
228a9826
MekkCyber erroring when loading on cpu
5a597336
MekkCyber updates
910ccfec
MekkCyber del blocks, scales
212acd0f
SunMarc fix
5c6d3b2c
SunMarc style
5ec240fc
SunMarc rm comm
2faa7ca4
MekkCyber comment
c5b8cecd
SunMarc add comment
79dd4fc1
SunMarc Merge pull request #36 from huggingface/default_to_dequantize_training
93f0816d
SunMarc Merge branch 'add-oai' into update-triton-kernels
c5e7bfcb
SunMarc style
d238ea4e
SunMarc Merge pull request #31 from huggingface/update-triton-kernels
76f90886
SunMarc remove duplicated lines
a7dd97fd
SunMarc Fix minor issue with weight_map conversion script
cf4843b4
zhuohan123 fix sampling params
8b7a73f2
ArthurZucker rename to final name
08b031b7
pcuenca Merge branch 'add-oai' into zhuohan/fix-sampling-parmsl
a39ebae3
Vaibhavs10 Merge pull request #37 from huggingface/zhuohan/fix-sampling-parmsl
8430860a
Vaibhavs10 update pre-final version of template
0d1a2da4
pcuenca Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py
5f3de46c
Vaibhavs10 Merge pull request #38 from huggingface/vb/upd-template
ce4e9129
MekkCyber fix batched inference
bddc8c2a
MekkCyber Merge pull request #39 from huggingface/fix_batched_inference
b2b1ca50
serve fixes
06b35eb5
SunMarc swizzle !
0de8f627
SunMarc Merge branch 'add-oai' into swizzle
a29c5a2d
Vaibhavs10 update final chat template by Matt.
aca1e72b
fix responses; pin oai
a8c3c493
SunMarc simplify
33636c91
Vaibhavs10 Thanks Matt for his tireless efforts!
af6fb990
gante `transformer serve` fixes for oai (mostly hide CoT)
22e8236f
Vaibhavs10 Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py
6f91a55a
SunMarc fix
afe89129
SunMarc Merge pull request #42 from huggingface/swizzle
b7dc08c1
Vaibhavs10 Merge pull request #41 from huggingface/vb/up-template-2
e991ef4c
ahadnagy Use ROCm kernels from HUB
7e540fc3
ahadnagy Make kernel modes explicit
3e4ad36a
ahadnagy Merge pull request #43 from huggingface/rocm-kernels-support
fa6eee9c
Vaibhavs10 update final chat template by Matt. x2
e946804b
Vaibhavs10 Thanks Matt for his tireless efforts!
1a8728d6
Vaibhavs10 Merge pull request #44 from huggingface/vb/up-template-3
f3225067
lewtun Fix installation
50b82506
lewtun Update setup.py
dec98d80
qgallouedec allow no content
0c6f911d
qgallouedec fix: update message handling in write_tokenizer function
181c625a
lewtun Merge pull request #45 from huggingface/fix-install
fa7a66dd
qgallouedec Fix template logic for user message role
7c741230
lewtun Merge pull request #47 from huggingface/fix-chat-template
672bc172
ArthurZucker Merge branch 'main' of github.com:huggingface/new-model-addition-open…
402976da
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
5509620c
ArthurZucker last nits for CB and flash_paged!
9d27880c
ArthurZucker there was one bad merge
4cf6186b
ArthurZucker fix CB (hardcode for now, its just using kv groups instead)
cac4c098
MekkCyber fix
eeef8c8d
SunMarc better fix for device_map
45fbc185
SunMarc Merge pull request #48 from huggingface/fix_target_device
92a2a498
SunMarc minor device fix
6dd3a723
ArthurZucker Fix flash paged
5ef7f3f4
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
47ae152a
ArthurZucker updates
d2303c71
ArthurZucker Revert "remove dtensors, not explicit (#39840)"
ed511f21
lewtun Merge pull request #46 from huggingface/fix-tool-chat-template
d8092b99
ArthurZucker update
e9b3708e
ArthurZucker Revert "remove dtensors, not explicit (#39840)"
70750d9a
ArthurZucker fix merge
35576899
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
fbc68154
MekkCyber fix
b939303b
qgallouedec Fix line break when custom model identity
d238182f
MekkCyber Merge pull request #49 from huggingface/fix_import_triton_kernels
7c364da1
ArthurZucker nits testing
088a6070
ArthurZucker to locals first and pass sliding window to flash paged
d91814b5
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
b392bc5e
ArthurZucker register modes for MegaBlocksMoeMlp
27bd828d
ArthurZucker add integration test in fixtures -> now update the tests to use it!
b667b7c1
ArthurZucker update integration tests
afffd581
MekkCyber initial fix
00d6703c
ArthurZucker style and update tests
6a8710ec
MekkCyber fix
4cb0a93a
MekkCyber Merge pull request #53 from huggingface/fix_warning
b6965318
MekkCyber Merge pull request #52 from huggingface/fix_kernels
a9b7b399
tengomucho chore(gpt oss): remove mlp_bias from configuration
b9f34dd6
SunMarc stats
eb942a6b
LysandreJik Integration tests
94a85f0a
LysandreJik whoops
210067a3
LysandreJik Shouldn't move model
e60807a7
LysandreJik Merge pull request #57 from huggingface/add-oai-integration-test-fixes
2718a7c9
Vaibhavs10 Merge pull request #50 from huggingface/fix-line-break
093ffd56
Rocketknight1 Ensure assistant messages without thinking always go to "final" channel
c954ef7d
Rocketknight1 More checks to ensure expected format
13f67567
ArthurZucker Merge pull request #54 from huggingface/remove-mlp_bias
6ef5c342
qgallouedec Add pad_token_id to model configuration in write_model function (#51)
bee0515d
LysandreJik Add oai fix fast tests (#59)
e1f46b45
Rocketknight1 Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py
e29f6590
Rocketknight1 Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py
5c6255ec
Rocketknight1 Update src/transformers/models/gpt_oss/convert_gpt_oss_weights_to_hf.py
889fe011
Rocketknight1 Merge pull request #58 from huggingface/update-template
25e8bd81
Vaibhavs10 reasoning -> Reasoning
9844308a
Vaibhavs10 Merge pull request #61 from huggingface/vb/upd-chat-temp-reasoning
563b5cf6
LysandreJik Add additional integration tests
b222c6ff
LysandreJik fixup
84210542
LysandreJik Slight fixes
60017719
qgallouedec align chat template with harmony
e360f176
qgallouedec simplify
5fe06b9e
LysandreJik Add comment
ba792c9d
LysandreJik torch testing assert close
afc0fc49
LysandreJik torch testing assert close
7bddb91b
LysandreJik torch testing assert close
4068437d
LysandreJik torch testing assert close
94f11c59
LysandreJik torch testing assert close
3660b2b3
LysandreJik torch testing assert close
974987fa
SunMarc Merge pull request #56 from huggingface/better-stats
768b5821
LysandreJik Revert fixup
d881a200
LysandreJik Merge pull request #62 from huggingface/add-new-integration-tests
0c7db230
ArthurZucker skip 2 test remove todo
66980045
ArthurZucker Merge branch 'add-oai' of github.com:huggingface/new-model-addition-o…
208b83c1
ArthurZucker merge
54cf55fa
ArthurZucker padding side should be left for integration tests
f19e04b9
ArthurZucker fix modular wrt to changes made to modeling
1f7cad06
ArthurZucker style
6973ba40
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-oai
9ab58975
ArthurZucker isort
1f47841b
ArthurZucker fix copies for the loss
865b368b
ArthurZucker mmmm
75f13d05
LysandreJik approved these changes on 2025-08-05
ArthurZucker merged 7c38d8fc into main 134 days ago
ArthurZucker deleted the add-oai branch 134 days ago
ArthurZucker added New model
ArthurZucker added Model Parallel
ArthurZucker added Mixture of Experts
ArthurZucker added Flash Attention