Fix memory leak in zero2 contiguous gradients (#3306)
extra_large_param_to_reduce is not used when contiguous_gradients is False,
yet it keeps a reference to the param for the lifetime of the application.
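
A minimal sketch of the leak pattern, for illustration only. The Param and
Optimizer classes and both reduce_* methods are hypothetical stand-ins, not
the DeepSpeed code, and the actual change in this patch may differ. It only
shows how an attribute that is never read can still pin an object in memory:

```python
import gc
import weakref


class Param:
    """Hypothetical stand-in for a large parameter tensor."""


class Optimizer:
    """Hypothetical optimizer; not the DeepSpeed implementation."""

    def __init__(self, contiguous_gradients):
        self.contiguous_gradients = contiguous_gradients
        self.extra_large_param_to_reduce = None

    def reduce_leaky(self, param):
        # Stores the param unconditionally, even if nothing reads it later.
        self.extra_large_param_to_reduce = param

    def reduce_fixed(self, param):
        # Stores the param only when the contiguous path will consume it.
        if self.contiguous_gradients:
            self.extra_large_param_to_reduce = param


def is_alive_after(reduce_name):
    """Return True if the param survives after the caller drops it."""
    opt = Optimizer(contiguous_gradients=False)
    param = Param()
    ref = weakref.ref(param)
    getattr(opt, reduce_name)(param)
    del param      # the caller releases its own reference
    gc.collect()
    return ref() is not None
```

With contiguous_gradients=False, the unconditional store keeps the param
alive for as long as the optimizer exists, while the guarded store lets it
be collected.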
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>