adds sync to flaky test_events_multi_gpu_query (#26231)
Summary:
This test sometimes fails in CI.
I suspect the flakiness arises because the test asks a CUDA stream to record an event, never synchronizes the CPU with that stream, and then queries the event from the CPU. Work submitted to a stream runs asynchronously with respect to the host, so there is no guarantee the event has been recorded by the time it is queried.
This one-line change preserves the intent of the test while ensuring the GPU has recorded the event before the CPU queries it.
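For illustration, the pattern can be sketched as follows. This is a minimal single-GPU sketch rather than the test's actual code: the helper name `event_recorded_after_sync` is hypothetical, and the use of `Stream.synchronize()` as the sync point is an assumption about how the fix could look. It returns None when PyTorch or CUDA is unavailable.

```python
def event_recorded_after_sync():
    """Record an event on the current CUDA stream, sync, then query it.

    Hypothetical helper for illustration only. Returns None when
    PyTorch or a CUDA device is not available.
    """
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None

    stream = torch.cuda.current_stream()
    # Enqueue an event record on the stream; this call returns
    # immediately and does not wait for the GPU.
    event = stream.record_event()
    # Block the CPU until all work queued on the stream, including
    # the event record, has completed. Without this, the query
    # below races with the GPU.
    stream.synchronize()
    # With the stream drained, the event is guaranteed to be recorded.
    return event.query()


if __name__ == "__main__":
    print(event_recorded_after_sync())
```

Without the `stream.synchronize()` call, `event.query()` may return either True or False depending on timing, which matches the intermittent failures described above.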
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26231
Differential Revision: D17382110
Pulled By: mruberry
fbshipit-source-id: 35b701f87f41c24b208aafde48bf10e1a54de059