Synchronize before changing the CUDA stream (#82050) (#82056)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/82050
A synchronization is needed before changing the CUDA stream.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82056
Approved by: https://github.com/ngimel