Improve docs on grad accumulation (#1817)
* Improve docs on grad accumulation
* Update docs/source/usage_guides/gradient_accumulation.md
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* fix
* address feedback
* Update docs/source/usage_guides/gradient_accumulation.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>