Enable correct supported activities for kineto on rocm (#88207)
A compile-time guard was preventing ActivityType::CUDA from being available on ROCm. As a result, the GPU_FALLBACK and CUDA modes were both active at the same time, so operators were charged GPU time for both the hipEventRecord ranges and the actual kernel execution times. This produced incorrect (and often negative) CUDA times in, e.g., table().
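
The double charging described above can be illustrated with a toy model. This is only a sketch: `OpRecord`, `charged_gpu_time`, and the timing values are hypothetical and are not part of kineto or the PyTorch profiler, and the sketch does not attempt to reproduce how the real accounting arrives at negative values.

```python
from dataclasses import dataclass


@dataclass
class OpRecord:
    """Hypothetical per-operator timing record (not a real profiler type)."""
    name: str
    kernel_us: float       # time taken from actual kernel execution (CUDA mode)
    event_range_us: float  # time between hipEventRecord calls (GPU_FALLBACK mode)


def charged_gpu_time(op: OpRecord, cuda_active: bool, fallback_active: bool) -> float:
    """Sum the GPU time charged to an operator by each active mode."""
    total = 0.0
    if cuda_active:
        total += op.kernel_us
    if fallback_active:
        total += op.event_range_us
    return total


# Made-up numbers, chosen only to show the effect.
op = OpRecord("aten::mm", kernel_us=120.0, event_range_us=130.0)

# Before the fix: both modes active, so the operator is charged twice.
both_modes = charged_gpu_time(op, cuda_active=True, fallback_active=True)

# After the fix: only the CUDA mode contributes.
cuda_only = charged_gpu_time(op, cuda_active=True, fallback_active=False)

print(both_modes, cuda_only)
```

With both modes enabled, the operator in this model is charged the sum of the two measurements rather than the kernel time alone.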
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88207
Approved by: https://github.com/malfet, https://github.com/jeffdaily