transformers
93dd4fb9 - Add Solar-Open Model (#43244)

Commit
2 days ago
Add Solar-Open Model (#43244)

* feat: implement solar-open-100b
* feat: update modeling_solar_open.py
* feat: update solar-open config
* chore: apply style
* feat: remove _tied_weights_keys
* feat: update modeling code
* chore: remove speech_to_text_2 in modeling
* docs: solar_open model
* test: solar open model
* chore: re-convert modular
* fix: remove require_read_token
* Apply suggestion from @vasqu

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* chore: update lincse year -> 2026
* feat: add solar_open to tokenizer mapping
* chore: update license year
* test: remove _torch_compile_train_cls
* docs: update solar_open doc
* refactor: simplify SolarOpenDecoderLayer
* refactor: inherit Glm4MoeConfig class
* fix: handle head_dim properly
* chore: apply style
* fix: default parameters
* test: use tiny dummy model
* update expectations and switch to eager moe (no fluctuations per grouped_mm / batched_mm)
* chore: remove trust_remote_code (suggestion from @vasqu)

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Update src/transformers/models/solar_open/modular_solar_open.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* chore: update config docstring
* chore: add partial_rotary_factor workaround comment
* test: check default config values in test_modeling_solar_open.py
* fix: config class interface
* docs: add SolarOpen to doctree
* docs: update dates
* Revert "feat: add solar_open to tokenizer mapping"

This reverts commit 038b1c1f691df3c5b4ac38d8ee5be6c07dcbb67f.

* feat: remove unnecessary configs
* test: update SolarOpenConfig tests
* fix: attention_dropout issue on training
* Revert "feat: remove unnecessary configs"

This reverts commit 9023688db77be07f11f99ea7578bca8c6e7258e7.

* Revert "fix: attention_dropout issue on training"

This reverts commit 3c275dcbb8210422437d4891b9aa67452297dc20.

* Revert "Revert "feat: remove unnecessary configs""

This reverts commit e6adcd9db9727b281af410123688c60a84fc86b1.

* Revert "Revert "fix: attention_dropout issue on training""

This reverts commit 573fa9aec111a7444a08016a79897e26a9d84258.

* feat: inherit attention from Llama
* fix: remove del for attention_bias and attention_dropout
* chore: convert solar_open
* fix date

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: vasqu <antonprogamer@gmail.com>