Fix Adafactor documentation (recommend correct settings) (#10526)
* Update optimization.py
Fix documentation to reflect optimal settings for Adafactor
* update and expand on the recommendations
* style
* Apply suggestions from code review
* flip scale_parameter to True for the 2nd recommendation
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>