[NCCL] Add option to run NCCL on high-priority CUDA stream (#43796)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43796
This diff adds an option that lets the NCCL process group backend use high-priority CUDA streams for its NCCL operations.
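For illustration only, a rough sketch of what selecting a high-priority stream means, using the public `torch.cuda.Stream` API rather than the process group code this diff touches. The helper names `stream_priority` and `make_comm_stream` are hypothetical and do not appear in the diff; the only assumption about CUDA is that a lower priority number means a higher priority.

```python
def stream_priority(high_priority: bool) -> int:
    """Map a boolean option to a CUDA stream priority.

    In CUDA, lower numbers mean higher priority; torch.cuda.Stream
    accepts -1 for a high-priority stream and 0 for the default.
    """
    return -1 if high_priority else 0


def make_comm_stream(high_priority: bool = False):
    """Create a CUDA stream for communication work (hypothetical helper).

    Requires PyTorch built with CUDA and a visible GPU; torch is imported
    lazily so that stream_priority() stays dependency-free.
    """
    import torch

    return torch.cuda.Stream(priority=stream_priority(high_priority))
```

On a machine with a GPU, `make_comm_stream(high_priority=True)` would return a stream whose kernels the scheduler may favor over work queued on default-priority streams; whether that helps depends on the workload.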
Test Plan: waitforsandcastle
Reviewed By: jiayisuse
Differential Revision: D23404286
fbshipit-source-id: b79ae097b7cd945a26e8ba1dd13ad3147ac790eb