transformers
864db665 - Add ForSequenceClassification heads for the OLMo family (#45551)

Commit
18 days ago
Add ForSequenceClassification heads for the OLMo family (#45551)

* Add Olmo/Olmo2/Olmo3 ForSequenceClassification

Adds sequence-classification heads to the OLMo family so `AutoModelForSequenceClassification.from_pretrained("allenai/OLMo-2-0425-1B")` (and the Olmo/Olmo3 equivalents) work out of the box.

Implementation follows the canonical modular-inheritance pattern used by Gemma/Gemma2, Qwen2/Qwen3, and Glm/Glm4: a single hand-written subclass in `modular_olmo.py` cascades trivially to Olmo2 and Olmo3 via the modular tooling, which resolves to the `GenericForSequenceClassification` mixin.

Also registers the three classes in `MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES` and adds autodoc entries to each model's doc page.

Coordination: https://github.com/huggingface/transformers/issues/45529
Maintainer approval: @Rocketknight1 ("This is welcome! ... happy for it to be mostly AI-written. Just ping me on the PR for review when it's ready!")
AI assistance: yes, per issue #45529.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add sequence-classification test coverage for Olmo family

For Olmo and Olmo2 (older `ModelTesterMixin` pattern), adds the new class to `all_model_classes` and wires up `text-classification` + `zero-shot` in `pipeline_model_mapping`, so standard forward/gradient tests run against the classification head.

For Olmo3 (newer `CausalLMModelTester` pattern), sets `sequence_classification_class = Olmo3ForSequenceClassification` on the model tester, which auto-enables `test_sequence_classification_model`, `test_sequence_classification_model_for_single_label`, and `test_sequence_classification_model_for_multi_label` from the base class.

Local verification on MPS: 413 non-TP tests pass; Olmo3's three classification tests pass specifically. TP tests (`test_tp_*`) are deselected on MPS hardware because they are CUDA-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
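
To illustrate why registering the classes in `MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES` is what makes the `AutoModelForSequenceClassification` entry point work, here is a minimal, library-free sketch of name-based dispatch. It is a toy stand-in, not the library's actual code: `SEQ_CLS_MAPPING_NAMES` and `resolve_seq_cls_class_name` are hypothetical names, and the `olmo`/`olmo2`/`olmo3` keys are assumed to match each config's model type.

```python
from collections import OrderedDict

# Hypothetical, trimmed-down registry. The real mapping holds many more
# entries and is consumed by the auto-model machinery, not by this helper.
SEQ_CLS_MAPPING_NAMES = OrderedDict(
    [
        ("olmo", "OlmoForSequenceClassification"),
        ("olmo2", "Olmo2ForSequenceClassification"),
        ("olmo3", "Olmo3ForSequenceClassification"),
    ]
)


def resolve_seq_cls_class_name(model_type: str) -> str:
    """Return the class name registered for a model type.

    Raises ValueError for unregistered model types, which is roughly the
    situation the OLMo family was in before its entries were added.
    """
    try:
        return SEQ_CLS_MAPPING_NAMES[model_type]
    except KeyError:
        raise ValueError(
            f"No sequence-classification class registered for {model_type!r}"
        ) from None
```

Under these assumptions, a checkpoint whose config reports model type `olmo2` resolves to `Olmo2ForSequenceClassification`, while a model type missing from the registry fails fast with a clear error.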