Only sync CUDA if the operation is run on GPU (#80328)
This fixes test failures when PyTorch is built without CUDA.
Fixes https://github.com/pytorch/pytorch/issues/58563
I used the same `is_cuda` check that is used in test_nn.py.
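
For illustration only, a minimal sketch of the guard pattern described here, not the actual diff. The helper name `sync_if_cuda` and the `CpuTensorStandIn` class are hypothetical. The sketch assumes the object passed in exposes an `is_cuda` attribute the way `torch.Tensor` does, and it imports torch lazily so the CPU path runs even without a CUDA build.

```python
def sync_if_cuda(tensor) -> bool:
    """Synchronize CUDA only when `tensor` lives on a GPU.

    Returns True if a synchronization was issued, False otherwise.
    `sync_if_cuda` is a hypothetical helper name used for this sketch.
    """
    # torch.Tensor exposes `is_cuda`; default to False for other objects.
    if getattr(tensor, "is_cuda", False):
        import torch  # imported lazily: only needed on the GPU path

        torch.cuda.synchronize()
        return True
    # CPU path: calling torch.cuda.synchronize() here would fail on a
    # build without CUDA, so it is skipped.
    return False


class CpuTensorStandIn:
    """Hypothetical stand-in for a CPU tensor, used only to demo the guard."""

    is_cuda = False


if __name__ == "__main__":
    print(sync_if_cuda(CpuTensorStandIn()))
```

With a real CPU tensor such as `torch.ones(3)`, the same call takes the CPU path and never touches `torch.cuda`.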
CC @ailzhang after #58564
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80328
Approved by: https://github.com/mruberry