Fix torch.cat() performance regression on single core CPU (#33534)
Summary:
This PR addresses the performance regression in `torch.cat()` on CPU when running with a single thread.
The previous optimization (https://github.com/pytorch/pytorch/issues/30806) introduced a regression in several cases of the PyTorch operator benchmark.
See https://github.com/pytorch/pytorch/issues/33334 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33534
Differential Revision: D20129963
Pulled By: VitalyFedyunin
fbshipit-source-id: 3fa6cd266978e5b54fa37105555502b77352df3e