transformers
dca93ca0 - Fix step shifting when accumulate gradient (#33673)

Committed 1 year ago
Fix step shifting when accumulate gradient (#33673)

* replace total_batched_samples with step while counting grad accum step
* remove unused variable
* simplify condition for update step
* fix format by ruff
* simplify update step condition using accelerator.sync_gradients
* simplify update condition using do_sync_step
* remove print for test

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
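The fix replaces a running `total_batched_samples` counter with the in-epoch `step` when deciding whether the current micro-batch completes a gradient-accumulation window. A minimal sketch of that kind of condition (not the actual Trainer code; the helper name `do_sync_step` is borrowed from the commit message, and the signature here is assumed):

```python
def do_sync_step(step: int, steps_in_epoch: int, grad_accum_steps: int) -> bool:
    """Return True when the optimizer should step: either this micro-batch
    completes an accumulation window, or it is the last batch of the epoch
    (so the remainder of an incomplete window is still flushed)."""
    completes_window = (step + 1) % grad_accum_steps == 0
    last_in_epoch = (step + 1) == steps_in_epoch
    return completes_window or last_in_epoch


# Example: 10 micro-batches per epoch, accumulating over 4.
update_steps = [s for s in range(10) if do_sync_step(s, 10, 4)]
# Optimizer steps fire after micro-batches 3, 7, and 9.
```

Keying the condition on the in-epoch `step` (rather than a counter carried across epochs) avoids the step shifting the commit fixes: a partial window at the end of one epoch can no longer offset the window boundaries of the next.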