transformers
d9050dc7 - [LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112)

[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112)

* [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing
* [LED] docs clarification
* [LED] gradient_checkpointing=True should be passed to TrainingArguments
* [LED] docs: remove wrong word
* [LED] docs: fix typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
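The first fix concerns `global_attention_mask` being dropped during generation. A minimal sketch of the usage this commit repairs, assuming the `allenai/led-base-16384` checkpoint and a placeholder input text (neither appears in the commit itself):

```python
# Sketch: passing global_attention_mask through LED's generate().
# Before this fix, the mask was silently dropped during generation.
import torch
from transformers import LEDForConditionalGeneration, LEDTokenizer

tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")

inputs = tokenizer("Long document to summarize ...", return_tensors="pt")

# Global attention on the first token (<s>), as is customary for LED.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,  # now forwarded to the encoder
    max_length=128,
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))
```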
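The docs clarification states that `gradient_checkpointing=True` should be passed to `TrainingArguments` rather than set elsewhere. A minimal sketch; the output directory and batch size are placeholder assumptions:

```python
# Sketch: enabling gradient checkpointing via TrainingArguments,
# as the updated LED docs recommend.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./led-summarization",   # placeholder path
    gradient_checkpointing=True,        # trades extra compute for lower memory
    per_device_train_batch_size=1,
)
```

Gradient checkpointing is particularly relevant for LED, whose long inputs make activation memory the binding constraint during fine-tuning.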
Changed files:
  • docs/source/en/model_doc/led.mdx
  • src/transformers/models/led/modeling_led.py