[PyTorch] Reapply D25544731: Avoid extra Tensor refcounting in _cat_out_cpu (#49760)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49760
The original change, D25544731, was reverted because it landed in a stack
together with D25542799 (https://github.com/pytorch/pytorch/commit/9ce1df079f6ea90dd4b7f9aa12a1a78d51a8b204), which was the change that was actually broken.
ghstack-source-id: 119361028
Test Plan: CI
Reviewed By: bwasti
Differential Revision: D25685789
fbshipit-source-id: 41e5abb4ff30acaa6f33f9c806acd652a6dd9646