Fix verification for ZeRO3 leaf module (#5074)
This PR improves verification for ZeRO3 leaf modules.
The leaf module requires input tensors with `requires_grad=True` to
launch reduce_scatter from backward hooks. Currently we throw an error
if any of the input tensors to the leaf module does not *require grad*.
This prevents us from using leaf modules in some scenarios, including
inference and activation checkpointing, as reported in #5008.
This PR addresses the issue by checking output tensors as well as input
tensors. The hook does not throw an error if no output tensor requires
grad.
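
The check described above can be sketched roughly as follows. This is an
illustration of the logic only, not the code in this PR: `StubTensor` and
`verify_leaf_module_io` are hypothetical names, and the stub stands in for a
real tensor so the example runs without PyTorch.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class StubTensor:
    """Hypothetical stand-in for a tensor; only `requires_grad` matters here."""
    requires_grad: bool


def verify_leaf_module_io(inputs: Iterable[StubTensor],
                          outputs: Iterable[StubTensor]) -> None:
    """Raise only when a backward pass is expected but an input cannot trigger it.

    Assumed rule, following the description above:
      - no output requires grad  -> nothing to verify (e.g. inference)
      - otherwise                -> every input must require grad
    """
    inputs = list(inputs)
    outputs = list(outputs)

    # No output requires grad, so the check is skipped entirely.
    if not any(t.requires_grad for t in outputs):
        return

    # At least one output requires grad: inputs must require grad as well.
    if any(not t.requires_grad for t in inputs):
        raise RuntimeError(
            "A leaf module output requires grad, but at least one input "
            "tensor does not; backward hooks on the inputs cannot fire."
        )


# Inference-like case: nothing requires grad, so no error is raised.
verify_leaf_module_io([StubTensor(False)], [StubTensor(False)])
```

With the old behavior, the last call would have failed because the input
does not require grad; under the sketched rule it passes because no output
requires grad either.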