Additional ops for ShardedTensor, ReplicatedTensor and PartialTensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76477
This PR adds the following ops:
1) softmax for ShardedTensor
2) getitem and unsqueeze for ReplicatedTensor
3) transpose and cat for PartialTensor
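
A rough illustration of the semantics these ops are expected to preserve, using plain `torch` tensors as stand-ins for the distributed types. This is a sketch, not the implementation in this PR: the per-rank lists, the variable names, the two-rank setup, and the assumption that softmax runs over a non-sharded dim are all hypothetical and chosen only to keep the example runnable in a single process.

```python
import torch

torch.manual_seed(0)
full = torch.randn(4, 6)

# ShardedTensor stand-in: two row-wise shards (sharding dim 0), one per
# hypothetical rank. Softmax over the non-sharded dim (dim=1) can be
# computed shard by shard with no cross-rank communication.
shards = torch.chunk(full, 2, dim=0)
sharded_result = torch.cat([torch.softmax(s, dim=1) for s in shards], dim=0)
sharded_ok = torch.allclose(sharded_result, torch.softmax(full, dim=1))

# ReplicatedTensor stand-in: every rank holds an identical copy, so
# indexing and unsqueeze applied locally leave the copies identical.
replicas = [full.clone() for _ in range(2)]
indexed = [r[1] for r in replicas]
unsqueezed = [r.unsqueeze(0) for r in replicas]
replicated_ok = (
    torch.equal(indexed[0], indexed[1])
    and torch.equal(unsqueezed[0], unsqueezed[1])
)

# PartialTensor stand-in: the logical value is the sum of per-rank partials.
# transpose and cat both commute with that sum, so they can be applied to
# the local partials before the reduction.
partials_a = [torch.randn(4, 6) for _ in range(2)]
partials_b = [torch.randn(4, 6) for _ in range(2)]
reduced_a = torch.stack(partials_a).sum(dim=0)
reduced_b = torch.stack(partials_b).sum(dim=0)

local_transposed = [p.transpose(0, 1) for p in partials_a]
transpose_ok = torch.allclose(
    torch.stack(local_transposed).sum(dim=0), reduced_a.transpose(0, 1)
)

local_cat = [torch.cat([a, b], dim=0) for a, b in zip(partials_a, partials_b)]
cat_ok = torch.allclose(
    torch.stack(local_cat).sum(dim=0), torch.cat([reduced_a, reduced_b], dim=0)
)
```

Each `*_ok` flag compares the per-rank computation against the same op applied to the full (or fully reduced) tensor.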
Differential Revision: [D35979510](https://our.internmc.facebook.com/intern/diff/D35979510/)
Approved by: https://github.com/fduwjj, https://github.com/wanchaol