Fix CUDA IPC deadlock (#40347)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40347
Fixes: #39541
Fixes: #25301
Differential Revision: D22152662
Test Plan: Imported from OSS
Pulled By: VitalyFedyunin
fbshipit-source-id: 82548aa4c937e0260932244e78cb132bcb3209b3