[DTensor] Moved `Transformer` sharding to staticmethod (#121660)
To support FSDP + TP/SP unit tests, let us factor out the canonical TP/SP sharding of `Transformer` to a staticmethod that can be called by other unit tests.
Test Plan:
```
pytest test/distributed/tensor/parallel/test_tp_examples.py -k test_transformer_training
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121660
Approved by: https://github.com/wanchaol, https://github.com/yifuwang
ghstack dependencies: #121360, #121357