[profiler][small] CUDA synchronize guard, minor fix (#58254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58254
Don't call CUDA synchronize when profiling in CPU-only mode.
Minor fixes: clarify a docstring and reduce spammy logging.
(Note: this ignores all push blocking failures!)
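
For illustration only, a minimal sketch of the kind of guard described above. This is not the code from the patch; `maybe_synchronize` and its parameters are hypothetical names. In PyTorch, the callable would presumably be `torch.cuda.synchronize` and the availability flag would come from `torch.cuda.is_available()`.

```python
from typing import Callable, List


def maybe_synchronize(
    use_cuda: bool,
    cuda_available: bool,
    synchronize: Callable[[], None],
) -> bool:
    """Call `synchronize` only when CUDA profiling is requested and a device exists.

    Hypothetical helper: returns True if the synchronize callable was invoked.
    """
    if use_cuda and cuda_available:
        synchronize()
        return True
    return False


# Record calls with a stand-in callable instead of a real CUDA synchronize.
calls: List[str] = []
cpu_only = maybe_synchronize(False, True, lambda: calls.append("sync"))
with_cuda = maybe_synchronize(True, True, lambda: calls.append("sync"))
```

In CPU-only mode the stand-in callable is never invoked, so no CUDA work is triggered by the profiler.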
Test Plan: manual + CI
Reviewed By: gdankel, chaekit
Differential Revision: D28423667
Pulled By: ilia-cher
fbshipit-source-id: 04c71727f528ae8e2e0ff90e88271608d291bc69