Allow TransformerEncoder and TransformerDecoder to accept inputs with a batch size of 0. (#62800)
Summary:
This PR fixes part of https://github.com/pytorch/pytorch/issues/12013, which is summarized concretely in https://github.com/pytorch/pytorch/issues/38115.
It allows `TransformerEncoder` and `TransformerDecoder` (along with the inner `Layer` classes) to accept inputs whose batch dimension has size 0.
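
A minimal sketch of the behavior described above, assuming a PyTorch build that includes this change. The sizes (`d_model`, `nhead`, `seq_len`) and the layer count are arbitrary illustrative values, not taken from the PR or its tests:

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions, not values from the PR).
d_model, nhead, seq_len = 8, 2, 5

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead), num_layers=2
)
decoder = nn.TransformerDecoder(
    nn.TransformerDecoderLayer(d_model=d_model, nhead=nhead), num_layers=2
)

# Default layout is (sequence, batch, feature); the batch dimension has size 0.
src = torch.rand(seq_len, 0, d_model)
tgt = torch.rand(seq_len, 0, d_model)

memory = encoder(src)          # encoder forward pass on an empty batch
out = decoder(tgt, memory)     # decoder forward pass on an empty batch

memory_shape = tuple(memory.shape)
out_shape = tuple(out.shape)
```

With the change in place, both calls should return tensors that keep the zero-size batch dimension rather than raising an error.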
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62800
Reviewed By: VitalyFedyunin
Differential Revision: D30303240
Pulled By: jbschlosser
fbshipit-source-id: 8f8082a6f2a9f9d7ce0b22a942d286d5db62bd12