accelerate
cd570b2e - reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP. (#1926)

Commit

2 years ago

reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP. (#1926)

* reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP.
* Apply suggestions from code review
  Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* update accelerator.reduce and accelerate.utils.operations.reduce

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
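The ordering named in the commit title (reduce across replicas first, then unscale) can be illustrated with a small pure-Python sketch. This is not the code from the commit: the helper names `all_reduce_mean`, `unscale`, and `reduce_then_unscale` are hypothetical, plain lists stand in for XLA tensors, and the assumption that unscaling also reports non-finite values is borrowed from how AMP gradient scalers generally behave.

```python
import math

def all_reduce_mean(per_replica_grads):
    """Average gradients element-wise across replicas (stand-in for an XLA all-reduce)."""
    n = len(per_replica_grads)
    return [sum(vals) / n for vals in zip(*per_replica_grads)]

def unscale(grads, loss_scale):
    """Divide by the loss scale and report whether any value is non-finite."""
    unscaled = [g / loss_scale for g in grads]
    found_inf = any(not math.isfinite(g) for g in unscaled)
    return unscaled, found_inf

def reduce_then_unscale(per_replica_grads, loss_scale):
    """Reduce first, so the unscale step sees the same gradients on every replica."""
    reduced = all_reduce_mean(per_replica_grads)
    return unscale(reduced, loss_scale)

loss_scale = 1024.0
replica_grads = [[2048.0, 1024.0], [4096.0, 3072.0]]  # scaled gradients on two replicas
grads, found_inf = reduce_then_unscale(replica_grads, loss_scale)
```

In this sketch an overflow on any one replica propagates through the reduction, so the non-finite check gives the same answer everywhere. Whether that is the motivation behind the commit is an assumption; the commit message does not state it.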

References

#1926 - reduce gradient first for XLA when unscaling the gradients in mixed precision training with AMP.

Author

ji-huazhong

Parents

727d6243
