accelerate
cd570b2e - reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP. (#1926)

Commit
2 years ago
reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP. (#1926)

* reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP.

* Apply suggestions from code review

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* update accelerator.reduce and accelerate.utils.operations.reduce

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
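The ordering named in the commit title (reduce across replicas first, then unscale) can be illustrated with a small pure-Python sketch. This is not the code from the commit: `all_reduce_mean` and `unscale` are hypothetical stand-ins for an XLA all-reduce and an AMP gradient unscale, and the gradient values are made up. The commit message does not state its motivation; the sketch shows one plausible concern, namely that replicas which unscale their local gradients separately can disagree about whether an overflow occurred.

```python
import math


def all_reduce_mean(per_replica_grads):
    """Average gradients element-wise across replicas (hypothetical stand-in for an all-reduce)."""
    n = len(per_replica_grads)
    return [sum(vals) / n for vals in zip(*per_replica_grads)]


def unscale(grads, loss_scale):
    """Divide by the loss scale and report whether any value is non-finite."""
    unscaled = [g / loss_scale for g in grads]
    found_inf = any(not math.isfinite(g) for g in unscaled)
    return unscaled, found_inf


loss_scale = 1024.0

# Scaled gradients held by two hypothetical replicas; the second one overflowed.
replica_grads = [
    [2048.0, 1024.0],
    [float("inf"), 3072.0],
]

# Unscaling locally: each replica reaches its own verdict on overflow.
local_flags = [unscale(g, loss_scale)[1] for g in replica_grads]

# Reducing first: every replica unscales the same averaged gradients,
# so they all see the same overflow flag.
reduced = all_reduce_mean(replica_grads)
grads, found_inf = unscale(reduced, loss_scale)
```

In this toy setup the local flags differ between replicas, while the reduce-first path yields a single shared flag. Whether that is the exact failure the commit addresses is an assumption.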