Fix release of IPG buffer (#7376)
#6993 broke many code paths in the ZeRO1/2 optimizer. This PR fixes most of
the issues that change introduced. One test in `unit/runtime/zero` still
fails:
```
====================================== short test summary info ======================================
FAILED test_zero.py::TestParamPartitioningSkipInit::test[dtype1] - RuntimeError: mat1 and mat2 must have the same dtype, but got Half and BFloat16
========= 1 failed, 204 passed, 66 skipped, 15 deselected, 5 warnings in 2305.03s (0:38:25) =========
```
---------
Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>