transformers
2a939f20 - Substantially reduce memory usage in _update_causal_mask for large batches by using .expand instead of .repeat [needs tests+sanity check] (#29413)

Commit

1 year ago

Substantially reduce memory usage in _update_causal_mask for large batches by using .expand instead of .repeat [needs tests+sanity check] (#29413) * try to fix gemma mem use * fix: handle attention mask dim==2 case * remove logits=logits.float() * clean up + add llama * apply formatting * readability edit: swap order of items being multiplied * revert change unrelated to PR * revert black autoformat * switch to one .to * Accept style edits Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

References

#29413 - Substantially reduce memory usage in _update_causal_mask for large batches by using .expand instead of .repeat

#29969 - [SigLIP] Add fast tokenizer

#32831 - [Docs] Update resources

#33111 - [Backbone] Remove out_features everywhere

#33174 - [Zero-shot image classification pipeline] Remove tokenizer_kwargs

#39821 - Support MetaCLIP 2

#59 - Fix attention mask handling in EoMT-DINOv3 converter

#62 - Add initial DEIMv2 model implementation

#65 - Fix RTDetrV2 sine position embedding ordering

#43729 - [Doc tests] Fix bug

Author

nqgl

Parents

965cf677

transformers 2a939f20 - Substantially reduce memory usage in _update_causal_mask for large batches by using .expand instead of .repeat [needs tests+sanity check] (#29413)

transformers
2a939f20 - Substantially reduce memory usage in _update_causal_mask for large batches by using .expand instead of .repeat [needs tests+sanity check] (#29413)