transformers
075dbbce - fix(trainer): Correct loss scaling for incomplete gradient accumulation steps (#39659)

* Fix issue #38837: incorrect loss scaling in the last (incomplete) gradient accumulation step of an epoch
* chore: trigger CI
* Update src/transformers/trainer.py
* Update src/transformers/modeling_flash_attention_utils.py

Co-authored-by: taihang <taihang@U-2RHYVWX7-2207.local>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
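
For context, here is a minimal sketch of the issue this commit addresses. It is not the actual `Trainer` implementation, and names like `accum_steps` and `num_batches` are illustrative assumptions: when the number of micro-batches in an epoch is not divisible by the accumulation steps, dividing every loss by the fixed accumulation-step count under-weights the final, shorter group; scaling by the actual size of the current group restores a proper mean.

```python
# Illustrative sketch (not the real transformers Trainer code) of loss
# scaling across gradient accumulation groups, including an incomplete
# final group at the end of an epoch.
import torch
from torch import nn

model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

num_batches = 10  # total micro-batches in the epoch (assumed)
accum_steps = 4   # accumulation steps -> the last group has only 2 batches

batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(num_batches)]

for start in range(0, num_batches, accum_steps):
    group = batches[start:start + accum_steps]
    # A buggy version divides by `accum_steps` unconditionally, which
    # under-weights the final, shorter group. Scaling by the actual
    # number of micro-batches accumulated in this group fixes that.
    divisor = len(group)
    optimizer.zero_grad()
    for x, y in group:
        loss = loss_fn(model(x), y) / divisor
        loss.backward()  # gradients accumulate across the group
    optimizer.step()
```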