Add `NoChunk` wrapper for pipeline args. (#57325)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57325
As per the design outlined in
https://github.com/pytorch/pytorch/issues/53952, this PR adds a `NoChunk`
wrapper for pipeline parallelism inputs.
If a Tensor is wrapped in `NoChunk`, the pipeline implementation does not
split it across micro-batches; instead it replicates the tensor as-is to every
micro-batch, the same way non-Tensor inputs are handled.
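
A minimal usage sketch, assuming the import paths and single-process RPC setup
of the pipeline API around the time of this PR; `ScaleByWeights` is a
hypothetical first stage that takes two inputs:

```python
import os
import torch
import torch.nn as nn
from torch.distributed import rpc
from torch.distributed.pipeline.sync import Pipe, NoChunk

# Pipe requires the RPC framework to be initialized, even in a single process.
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "29500"
rpc.init_rpc("worker", rank=0, world_size=1)

class ScaleByWeights(nn.Module):
    # Hypothetical first pipeline stage taking two inputs: the chunked
    # activations and a per-feature weight vector that must not be chunked.
    def forward(self, x, weights):
        return x * weights

model = nn.Sequential(ScaleByWeights(), nn.Linear(16, 4))
pipe = Pipe(model, chunks=4)

x = torch.randn(8, 16)     # split into 4 micro-batches of size 2 along dim 0
weights = torch.randn(16)  # replicated as-is to every micro-batch

# Without NoChunk, `weights` would also be split along dim 0 and would no
# longer broadcast against each micro-batch of `x`.
output = pipe(x, NoChunk(weights)).local_value()

rpc.shutdown()
```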
ghstack-source-id: 132009305
Test Plan:
1) unit tests.
2) waitforbuildbot.
Reviewed By: SciPioneer
Differential Revision: D28109277
fbshipit-source-id: ee78c814c715d207d2796aba40b756a8e1834898