DeepSpeed
c66bc426 - set the default to use set_to_none for clearing gradients in BF16 optimizer. (#5434)

Commit
1 year ago
As discussed in #5175, set the default to use set_to_none for clearing gradients in the BF16 optimizer. Additionally, for the zero-clearing case, use foreach_zero. Correctness was verified with mega-ds Llama 7B training.

FYI @loadams

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
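
The two clearing modes named in the commit message can be illustrated with a minimal PyTorch sketch. This is not DeepSpeed's actual code: `clear_grads` and its `params` argument are hypothetical names chosen for illustration, and the sketch assumes "foreach_zero" refers to PyTorch's private `torch._foreach_zero_` helper, which zeroes a list of tensors in place.

```python
import torch


def clear_grads(params, set_to_none=True):
    """Hypothetical helper (not DeepSpeed's API) showing both clearing modes."""
    if set_to_none:
        # Default mode: drop the gradient tensors so their memory can be freed.
        for p in params:
            p.grad = None
        return
    # Zero-clearing mode: zero all existing gradients with one batched call
    # instead of calling .zero_() on each tensor separately.
    grads = [p.grad for p in params if p.grad is not None]
    if grads:
        torch._foreach_zero_(grads)


# Two small parameters with non-zero gradients attached by hand.
params = [torch.nn.Parameter(torch.ones(3)) for _ in range(2)]
for p in params:
    p.grad = torch.full_like(p.data, 2.0)

clear_grads(params, set_to_none=False)
zeroed = all(bool((p.grad == 0).all()) for p in params)

clear_grads(params)  # default: set_to_none=True
released = all(p.grad is None for p in params)
```

With `set_to_none=True` the gradient tensors are released rather than overwritten, which is also the default behaviour of `torch.optim.Optimizer.zero_grad` in PyTorch 2.0 and later. Code that reads `.grad` after clearing has to handle `None` in that mode.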