transformers
dfcb1a36 - fix(rope): read original_max_position_embeddings from yarn validator's argument (#45887)

fix(rope): read original_max_position_embeddings from yarn validator's argument (#45887)

* fix(rope): read original_max_position_embeddings from yarn validator's argument

`_validate_yarn_rope_parameters` is called by `validate_rope` once for each per-attention-type sub-dict, with that sub-dict passed as the `rope_parameters` argument. The `factor` consistency check inside the function, however, reads `original_max_position_embeddings` from `self.rope_parameters[...]` instead of from the argument. This raises `KeyError` for any config that keeps the nested `{full_attention, sliding_attention, ...}` shape, since the per-type sub-dicts sit inside one of those keys rather than at the top level.

Other rope validators in the same file (`_validate_default_rope_parameters`, `_validate_linear_rope_parameters`, etc.) all read from the function argument, so this change matches their pattern.

* test(rope): mirror test_rope_validation for per-attention-type nested rope_parameters

* test(rope): apply ruff format to nested-rope test

Co-authored-by: Raushan Turganbay <raushan@huggingface.co>
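
A minimal sketch of the pattern the fix restores, assuming a hypothetical stand-in config and validator rather than the real classes in transformers' rope utilities; the names mirror the commit message, but the signatures and the exact `factor` check here are illustrative only:

```python
import math


class DummyConfig:
    """Hypothetical config that keeps the nested per-attention-type rope shape."""

    def __init__(self):
        self.max_position_embeddings = 131072
        self.rope_parameters = {
            "full_attention": {
                "rope_type": "yarn",
                "factor": 32.0,
                "original_max_position_embeddings": 4096,
            },
            "sliding_attention": {"rope_type": "default"},
        }


def validate_yarn_sub_dict(config, rope_parameters):
    """Called once for each per-attention-type sub-dict (the fixed pattern)."""
    factor = rope_parameters["factor"]

    # Buggy pattern (sketched): reading from the top-level mapping raises
    # KeyError, because its keys are "full_attention"/"sliding_attention",
    # not "original_max_position_embeddings".
    #   original_max = config.rope_parameters["original_max_position_embeddings"]

    # Fixed pattern: read from the argument, like the other rope validators.
    original_max = rope_parameters["original_max_position_embeddings"]

    # Illustrative consistency check between `factor` and the two lengths.
    expected = config.max_position_embeddings / original_max
    if not math.isclose(factor, expected):
        print(f"warning: factor={factor} but max/original={expected}")


config = DummyConfig()
for sub_dict in config.rope_parameters.values():
    if sub_dict.get("rope_type") == "yarn":
        validate_yarn_sub_dict(config, sub_dict)
```

With the nested shape above, the commented-out lookup would fail immediately, while the argument-based lookup validates each yarn sub-dict as intended.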