Enable CPU profiling when device='cuda' and profiling turned on (#1101)
Summary:
Pull Request resolved: https://github.com/pytorch/benchmark/pull/1101
When profiling CUDA models, I think we should collect both CUDA and CPU activity. Without CPU activity profiling turned on, we can't see which CPU events are responsible for launching the CUDA kernels.
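A rough sketch of the intended selection logic, not the actual diff in this PR. The helper names `select_activity_names` and `to_profiler_activities` are hypothetical and used only for illustration; `torch.profiler.ProfilerActivity` is the real PyTorch enum they map onto.

```python
from typing import List


def select_activity_names(device: str) -> List[str]:
    """Return the names of the profiler activities to record for a device.

    Hypothetical helper: CPU activity is always recorded, so that CUDA
    kernel launches can be traced back to the CPU events that issued them.
    """
    names = ["CPU"]
    if device == "cuda":
        # On CUDA devices, record GPU kernels in addition to CPU events.
        names.append("CUDA")
    return names


def to_profiler_activities(names: List[str]):
    """Map activity names onto torch.profiler.ProfilerActivity members."""
    # Imported lazily so select_activity_names works without PyTorch installed.
    from torch.profiler import ProfilerActivity

    return [getattr(ProfilerActivity, name) for name in names]
```

With PyTorch available, the result could be passed as `torch.profiler.profile(activities=to_profiler_activities(select_activity_names("cuda")))`, so the resulting trace contains both the CPU ops and the kernels they launch.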
Test Plan: Imported from OSS
Reviewed By: aaronenyeshi, xuzhao9
Differential Revision: D38638497
Pulled By: davidberard98
fbshipit-source-id: ccc033047ff6d583bf938845263692c66c473876