DeepSpeed
01d17492 - Fix memory leak in zero2 contiguous gradients (#3306)

Commit
2 years ago
Fix memory leak in zero2 contiguous gradients (#3306)

No usage of extra_large_param_to_reduce if contiguous_gradients is False. It keeps reference of the param for the lifetime of the application.

Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
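
The leak pattern described in the commit message can be illustrated with a small, self-contained sketch. This is not DeepSpeed's code: `Param`, `ToyReducer`, `on_large_param`, the `guard` flag, and `param_survives` are hypothetical names invented for illustration, and only the attribute name `extra_large_param_to_reduce` and the option name `contiguous_gradients` come from the commit message. The sketch assumes the fix amounts to storing the reference only when `contiguous_gradients` is enabled.

```python
import gc
import weakref


class Param:
    """Hypothetical stand-in for a large parameter tensor."""

    def __init__(self, numel):
        self.numel = numel


class ToyReducer:
    """Hypothetical reducer; not DeepSpeed's actual optimizer class."""

    def __init__(self, contiguous_gradients, guard):
        self.contiguous_gradients = contiguous_gradients
        self.guard = guard  # True = keep the reference only when it is needed
        self.extra_large_param_to_reduce = None

    def on_large_param(self, param):
        # The leaky variant (guard=False) stores the reference unconditionally.
        # The guarded variant stores it only when contiguous_gradients is on.
        if self.contiguous_gradients or not self.guard:
            self.extra_large_param_to_reduce = param


def param_survives(guard):
    """Return True if the param is still alive after the caller drops it."""
    reducer = ToyReducer(contiguous_gradients=False, guard=guard)
    param = Param(numel=10**6)
    probe = weakref.ref(param)  # observes the param without keeping it alive
    reducer.on_large_param(param)
    del param
    gc.collect()
    # The reducer is still in scope here, so anything it references survives.
    return probe() is not None


print("leaky variant keeps param alive:", param_survives(guard=False))
print("guarded variant keeps param alive:", param_survives(guard=True))
```

With `contiguous_gradients=False`, the unguarded variant pins the parameter for as long as the reducer lives, while the guarded variant lets it be freed as soon as the caller drops it.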