[fbcode] Use FastCat in PT Concat implementation (#106727)
Summary: Reimplement D48081898 and PR https://github.com/pytorch/pytorch/pull/106518 in fbcode first, to speed up the launch process.
Test Plan:
All checks passed: https://github.com/pytorch/pytorch/actions/runs/5758987335/job/15612600466?pr=106518
(For my own learning)
Check out the OSS PyTorch repo and run the test, following the instructions in https://www.internalfb.com/intern/wiki/PyTorch/PyTorchDev/Workflow/PyTorch_environment_setup/
and https://www.internalfb.com/intern/wiki/PyTorch/PyTorchDev/Workflow/PyTorch_environment_setup/oss_setup_on_devserver:
```
pytest -k test_cat_out test/test_tensor_creation_ops.py -v -s
```
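For a quick local sanity check outside the test suite, the kind of property a cat-with-`out` test exercises can be sketched as below. This is only an illustrative sketch, not the actual contents of `test_cat_out`; the helper name `check_cat_out` and the input shapes are hypothetical, and it assumes a working `torch` install.

```python
import torch


def check_cat_out(dim: int = 0) -> bool:
    """Hypothetical helper: compare torch.cat(..., out=...) with the allocating form."""
    a = torch.arange(6, dtype=torch.float32).reshape(2, 3)
    b = torch.arange(6, 12, dtype=torch.float32).reshape(2, 3)

    # Reference result from the allocating overload.
    expected = torch.cat([a, b], dim=dim)

    # Preallocate the destination and write into it through the out= overload.
    out = torch.empty_like(expected)
    result = torch.cat([a, b], dim=dim, out=out)

    # The values must match, and the returned tensor should share storage with `out`.
    return torch.equal(out, expected) and result.data_ptr() == out.data_ptr()


if __name__ == "__main__":
    for d in (0, 1):
        print(f"dim={d}: {'ok' if check_cat_out(d) else 'MISMATCH'}")
```

A check like this does not show which backend kernel was used; it only confirms that the observable result of `torch.cat` is unchanged.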
To submit to GitHub:
```
hg amend; jf submit; ghexport
```
Differential Revision: D48082741
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106727
Approved by: https://github.com/ezyang, https://github.com/houseroad