transformers
d9050dc7 - [LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112)

[LED] fix global_attention_mask not being passed for generation and docs clarification about grad checkpointing (#17112)

* [LED] fixed global_attention_mask not passed for generation + docs clarification for gradient checkpointing
* [LED] docs clarification
* [LED] gradient_checkpointing=True should be passed to TrainingArguments
* [LED] docs: remove wrong word
* [LED] docs: fix typo

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
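The first fix concerns `global_attention_mask` being dropped during generation. A minimal sketch of the usage this commit repairs, assuming the `allenai/led-base-16384` checkpoint and a placeholder input text (neither appears in the commit itself):

```python
# Sketch: passing global_attention_mask through LED's generate().
# Before this fix, the mask was silently dropped during generation.
import torch
from transformers import LEDForConditionalGeneration, LEDTokenizer

tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")

inputs = tokenizer("Long document to summarize ...", return_tensors="pt")

# Global attention on the first token (<s>), as is customary for LED.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,  # now forwarded to the encoder
    max_length=128,
)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True))
```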
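The docs clarification states that `gradient_checkpointing=True` should be passed to `TrainingArguments` rather than set elsewhere. A minimal sketch; the output directory and batch size are placeholder assumptions:

```python
# Sketch: enabling gradient checkpointing via TrainingArguments,
# as the updated LED docs recommend.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./led-summarization",   # placeholder path
    gradient_checkpointing=True,        # trades extra compute for lower memory
    per_device_train_batch_size=1,
)
```

Gradient checkpointing is particularly relevant for LED, whose long inputs make activation memory the binding constraint during fine-tuning.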
Changed files:
  • docs/source/en/model_doc/led.mdx
  • src/transformers/models/led/modeling_led.py