[PyTorch] Use native serial stack when there is only 1 thread (#76399)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76399
It does not make sense to hit the non-serial kernel when intra-op parallelism is off, i.e. when there is only 1 thread, so use the native serial stack in that case.
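
The actual diff is not reproduced in this message, so the following is only a minimal sketch of the dispatch idea, not the code from this PR. The names `sum_serial`, `sum_parallel`, and `sum_dispatch` are hypothetical, and the thread count is passed in explicitly as an assumption; in PyTorch the intra-op thread count would typically come from `at::get_num_threads()`.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical serial kernel: a plain loop with no threading overhead.
long long sum_serial(const std::vector<int>& data) {
  return std::accumulate(data.begin(), data.end(), 0LL);
}

// Hypothetical parallel kernel: splits the input into num_threads chunks
// and sums each chunk on its own thread. Expects num_threads >= 1.
long long sum_parallel(const std::vector<int>& data, std::size_t num_threads) {
  std::vector<long long> partial(num_threads, 0);
  std::vector<std::thread> workers;
  const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
  for (std::size_t t = 0; t < num_threads; ++t) {
    const std::size_t begin = std::min(t * chunk, data.size());
    const std::size_t end = std::min(begin + chunk, data.size());
    workers.emplace_back([&data, &partial, t, begin, end] {
      const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
      const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
      partial[t] = std::accumulate(first, last, 0LL);
    });
  }
  for (auto& w : workers) {
    w.join();
  }
  return std::accumulate(partial.begin(), partial.end(), 0LL);
}

// Dispatch: with at most one thread there is nothing to parallelize,
// so take the serial path and skip the thread setup entirely.
long long sum_dispatch(const std::vector<int>& data, std::size_t num_threads) {
  if (num_threads <= 1) {
    return sum_serial(data);
  }
  return sum_parallel(data, num_threads);
}
```

Calling `sum_dispatch(values, 1)` never spawns a thread, while `sum_dispatch(values, 4)` takes the parallel path; both return the same result.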
ghstack-source-id: 154934866
Test Plan: Existing unit tests
Reviewed By: tenpercent
Differential Revision: D35946012
fbshipit-source-id: 9ae12e267826b4da96b0a105c34260bc4ac91135
(cherry picked from commit f61eb10a754cab5491ec972b72aa5e92e0638e50)