Use parallel thrust execution policy on ROCm (#15481)
Summary:
The Thrust version shipped with ROCm is recent enough to support the parallel execution policy API. Minimize divergence between the CUDA and ROCm code paths by changing the ifdef guards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15481
Differential Revision: D13598739
Pulled By: bddppq
fbshipit-source-id: 20d0a7e3887a4050eea65033161561af47411de1