DeepSpeed
f02d7bda - Fix verification for ZeRO3 leaf module (#5074)

Commit
1 year ago
Fix verification for ZeRO3 leaf module (#5074)

This PR improves verification for the ZeRO3 leaf module. A leaf module requires input tensors with `requires_grad=True` so that reduce_scatter can be launched from backward hooks.

Currently we throw an error if any input tensor to the leaf module does not require grad. This prevents leaf modules from being used in some scenarios, including inference and activation checkpointing, as reported in #5008.

This PR addresses the issue by checking output tensors as well as input tensors. The hook does not throw an error if no output tensor requires grad.
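The check described in this message can be sketched roughly as follows. This is a minimal illustration, not the DeepSpeed implementation: the `Tensor` dataclass is a hypothetical stand-in for `torch.Tensor` so the sketch runs without PyTorch, `verify_leaf_module` is an invented name, and the input condition follows a literal reading of the message (an error when some input does not require grad).

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Tensor:
    """Hypothetical stand-in for torch.Tensor; only requires_grad matters here."""
    requires_grad: bool = False


def verify_leaf_module(inputs: Iterable[Tensor], outputs: Iterable[Tensor]) -> bool:
    """Return True if the inputs pass the check, False if the check is skipped.

    Raises RuntimeError only when some output requires grad (so a backward
    pass is expected) while some input does not require grad.
    """
    outputs_need_grad = any(t.requires_grad for t in outputs)
    if not outputs_need_grad:
        # No output requires grad (e.g. inference), so no backward pass
        # will run through this module and there is nothing to verify.
        return False
    if not all(t.requires_grad for t in inputs):
        raise RuntimeError(
            "leaf module input does not require grad, but an output does"
        )
    return True
```

With this shape, a forward pass in which neither inputs nor outputs require grad returns `False` instead of raising, which matches the inference case the message mentions.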