Fix CUDA stream synchronization when arguments contain RRefs (#57394)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57394
Test Plan: Imported from OSS
Reviewed By: lw
Differential Revision: D28131325
Pulled By: mrshenli
fbshipit-source-id: 7174942d4c8dabe13f8eb1ba7fea599922a022c0