Fix attribution of some CUDA events to CPU events (#51632)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51632
Some fixes:
- attribute CUDA Runtime events to proper PyTorch CPU events
- make sure we don't accidentally attribute CUDA kernels to CUDA Runtime
  events whose ids have different semantics
- minor fixes in the output
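
Illustration of the attribution idea in the fixes above. This is a minimal sketch, not the profiler's actual code: the `Event` record, its fields (`kind`, `event_id`, `parent_op_id`), and `attribute_kernels` are hypothetical names chosen for this example. It assumes a kernel shares an id with the CUDA Runtime call that launched it, and that the runtime call records which CPU op it ran under. Each event kind gets its own lookup table, so an id that happens to be equal in another id space is never matched by mistake.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Event:
    # Hypothetical record; not the PyTorch profiler's real event type.
    name: str
    kind: str                           # "cpu_op", "cuda_runtime", or "kernel"
    event_id: int                       # meaning depends on `kind`
    parent_op_id: Optional[int] = None  # set on runtime events only


def attribute_kernels(events: List[Event]) -> Dict[str, str]:
    """Map kernel name -> CPU op name via the launching runtime event."""
    # Separate tables per kind: the id spaces are assumed to be unrelated.
    cpu_ops = {e.event_id: e for e in events if e.kind == "cpu_op"}
    runtime = {e.event_id: e for e in events if e.kind == "cuda_runtime"}

    attribution: Dict[str, str] = {}
    for e in events:
        if e.kind != "kernel":
            continue
        launch = runtime.get(e.event_id)  # kernel -> runtime call
        if launch is None or launch.parent_op_id is None:
            continue                      # leave the kernel unattributed
        op = cpu_ops.get(launch.parent_op_id)  # runtime call -> CPU op
        if op is not None:
            attribution[e.name] = op.name
    return attribution


events = [
    Event("aten::mm", "cpu_op", 7),
    Event("aten::relu", "cpu_op", 42),  # same number, different id space
    Event("cudaLaunchKernel", "cuda_runtime", 42, parent_op_id=7),
    Event("sgemm_kernel", "kernel", 42),
]
attribution = attribute_kernels(events)
```

In this made-up trace, looking up id 42 directly among the CPU ops would wrongly pick `aten::relu`. Going through the runtime event attributes the kernel to `aten::mm` instead.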
Test Plan:
CI
https://gist.github.com/ilia-cher/0e78d0440fe02b77ff6721571c14f01c
https://gist.github.com/ilia-cher/8f655cf15beb1b11547fd3564a1c3958
Reviewed By: gdankel
Differential Revision: D26222734
Pulled By: ilia-cher
fbshipit-source-id: 13571dbeea0222ee1a531edacd1f4153f1e38da3