transformers
Add GPT OSS model from OpenAI
#39923
Merged

Commits
  • fix
    qgallouedec committed 235 days ago
  • nice
    qgallouedec committed 235 days ago
  • where i am at
    ArthurZucker committed 235 days ago
  • Bro this works
    ArthurZucker committed 235 days ago
  • Merge pull request #16 from huggingface/fix-attention
    qgallouedec committed 235 days ago
  • Update src/transformers/integrations/tensor_parallel.py
    ArthurZucker committed 234 days ago
  • Merge pull request #11 from huggingface/tp_embed_parallel
    ArthurZucker committed 234 days ago
  • cleanups
    ArthurZucker committed 234 days ago
  • Merge branch 'add-oai' into add-fast-flash-kernel
    ArthurZucker committed 234 days ago
  • yups that was breaking
    ArthurZucker committed 234 days ago
  • Merge branch 'add-fast-flash-kernel' of github.com:huggingface/new-model-addition-openai into add-fast-flash-kernel
    ArthurZucker committed 234 days ago
  • Merge pull request #15 from huggingface/add-fast-flash-kernel
    ArthurZucker committed 234 days ago
  • Merge branch 'main' of github.com:huggingface/new-model-addition-openai into add-oai
    ArthurZucker committed 234 days ago
  • Update src/transformers/models/openai_moe/modeling_openai_moe.py
    ArthurZucker committed 234 days ago
  • merge
    ArthurZucker committed 234 days ago
  • gather on experts and not mlp
    SunMarc committed 228 days ago
  • add changes for latest convert branch
    edbeeching committed 227 days ago
  • adds options to get output_router_logits from config
    edbeeching committed 227 days ago
  • bring chat temlate + special tokens back into the script.
    Vaibhavs10 committed 223 days ago
  • Merge pull request #22 from huggingface/vb/special-tok
    ArthurZucker committed 223 days ago
  • Merge pull request #21 from huggingface/ed-fix-modeling
    ArthurZucker committed 223 days ago
  • initial commmit
    MekkCyber committed 223 days ago
  • update
    MekkCyber committed 223 days ago
  • working with shards
    MekkCyber committed 223 days ago
  • add model.safetensors.index.json
    MekkCyber committed 223 days ago
  • fix
    MekkCyber committed 223 days ago
  • fix
    MekkCyber committed 223 days ago
  • mxfp4 flag
    MekkCyber committed 223 days ago
  • rm print
    MekkCyber committed 223 days ago
  • Fix PAD/EOS/BOS (#18)
    SunMarc committed 223 days ago
  • add some doc
    MekkCyber committed 222 days ago
  • Merge pull request #23 from huggingface/update_conversion_script
    SunMarc committed 222 days ago
  • special tokens based on harmony.
    Vaibhavs10 committed 222 days ago
  • add in tokenizer config as well.
    Vaibhavs10 committed 222 days ago
  • Merge pull request #25 from huggingface/vb/upd-conversion-script
    Vaibhavs10 committed 222 days ago
  • prepare for rebase with main
    ArthurZucker committed 221 days ago
  • Merge branches 'add-oai' and 'add-oai' of github.com:huggingface/new-model-addition-openai into add-oai
    ArthurZucker committed 221 days ago
  • merge with main
    ArthurZucker committed 221 days ago
  • Fix for initialize_tensor_parallelism now returning 4-tuple
    edbeeching committed 221 days ago
  • mxfp4
    MekkCyber committed 221 days ago
  • mxfp4 draft
    MekkCyber committed 221 days ago
  • fix
    MekkCyber committed 221 days ago
  • fix import
    MekkCyber committed 221 days ago
  • draft
    MekkCyber committed 221 days ago
  • draft impl
    MekkCyber committed 221 days ago
  • finally working !
    MekkCyber committed 221 days ago
  • simplify
    MekkCyber committed 221 days ago
  • add import
    MekkCyber committed 221 days ago
  • working version
    MekkCyber committed 221 days ago
  • consider blocks and scales
    MekkCyber committed 221 days ago
  • device mesh fix
    MekkCyber committed 221 days ago
  • initial commit
    MekkCyber committed 221 days ago
  • add working dequant + quant logic
    MekkCyber committed 221 days ago
  • update
    MekkCyber committed 221 days ago
  • non nan, gibberish output
    MekkCyber committed 221 days ago
  • working EP + quantization finally !
    MekkCyber committed 221 days ago
  • start cleaning
    MekkCyber committed 221 days ago
  • remove reversing process
    MekkCyber committed 221 days ago
  • style
    MekkCyber committed 221 days ago
  • some cleaning
    MekkCyber committed 221 days ago
  • initial commmit
    MekkCyber committed 221 days ago
  • more cleaning
    MekkCyber committed 221 days ago
  • more cleaning
    MekkCyber committed 221 days ago
  • simplify
    MekkCyber committed 221 days ago
  • more cleaning
    MekkCyber committed 221 days ago
  • rm duplicated function
    MekkCyber committed 221 days ago
  • changing tp_plan
    MekkCyber committed 221 days ago
  • update tp plan check
    MekkCyber committed 221 days ago
  • add loading attribute
    MekkCyber committed 221 days ago
  • dequantizing logic
    MekkCyber committed 221 days ago
  • use subfunctions
    MekkCyber committed 221 days ago
  • import cleaning
    MekkCyber committed 221 days ago
  • update_param_name
    MekkCyber committed 221 days ago
  • adds clamped swiglu
    edbeeching committed 221 days ago
  • add clamping to training path
    edbeeching committed 217 days ago
  • simplify dequant logic
    MekkCyber committed 217 days ago
  • Merge branch 'main' of github.com:huggingface/new-model-addition-openai into add-oai
    ArthurZucker committed 217 days ago
  • update
    ArthurZucker committed 217 days ago
  • Merge branch 'add-oai' of github.com:huggingface/new-model-addition-openai into add-oai
    ArthurZucker committed 217 days ago
  • Bad merge
    ArthurZucker committed 217 days ago
  • more simplifications & tests
    MekkCyber committed 217 days ago
  • fix !
    ArthurZucker committed 217 days ago
  • Merge pull request #26 from huggingface/add-clamp-swiglu
    ArthurZucker committed 217 days ago
  • fix registering custom attention
    ArthurZucker committed 217 days ago
  • fix order
    MekkCyber committed 216 days ago
  • fixes
    MekkCyber committed 216 days ago
  • some test nits
    MekkCyber committed 216 days ago
  • nits
    MekkCyber committed 216 days ago
  • nit
    MekkCyber committed 216 days ago
  • Merge branch 'add-oai' into adding_packing_format_option
    MekkCyber committed 216 days ago
  • Merge pull request #20 from huggingface/adding_packing_format_option
    MekkCyber committed 216 days ago
  • fix
    MekkCyber committed 216 days ago
  • Merge pull request #27 from huggingface/guard_kernels_imports
    SunMarc committed 216 days ago
  • Clamp sink logits
    lewtun committed 215 days ago
  • Clean
    lewtun committed 215 days ago
  • Soft-max trick
    lewtun committed 215 days ago
  • Clean up
    lewtun committed 215 days ago
  • p
    lewtun committed 215 days ago
  • Merge pull request #28 from huggingface/fix-train-bsz
    ArthurZucker committed 215 days ago
  • fix deepspeed
    MekkCyber committed 215 days ago
  • + more commits ...