[CUDA Host Allocator][ROCm] fixes (#110715)
Follow-up to #110123: removes the CUDA_VERSION check for ROCm, because HIP already provides hipMallocAsync() and does not need the version check there.
Follow-up to #108488: fixes the failing unit tests by accepting either a "cuda" or "hip" attribute for the caching allocator options. This is aligned with the masquerading strategy for ROCm/HIP.
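
As a rough illustration of the "accept either spelling" idea, here is a minimal Python sketch. It is not the code changed in this PR. The helper `read_backend_attr` and the attribute names `use_cuda_host_register` / `use_hip_host_register` are hypothetical, chosen only to show how a lookup can tolerate both the "cuda" and the "hip" form of a name.

```python
from types import SimpleNamespace


def read_backend_attr(obj, template, default=None):
    """Return the first attribute found for the "cuda" or "hip" spelling.

    `template` is a format string with a `{backend}` placeholder,
    e.g. "use_{backend}_host_register" (a hypothetical name).
    """
    for backend in ("cuda", "hip"):
        name = template.format(backend=backend)
        if hasattr(obj, name):
            return getattr(obj, name)
    return default


# Hypothetical option objects: one uses the CUDA spelling, one the HIP spelling.
cuda_opts = SimpleNamespace(use_cuda_host_register=True)
hip_opts = SimpleNamespace(use_hip_host_register=True)

# Both spellings resolve to the same logical option.
cuda_value = read_backend_attr(cuda_opts, "use_{backend}_host_register")
hip_value = read_backend_attr(hip_opts, "use_{backend}_host_register")
```

A check written this way behaves the same whether or not the "cuda" token in a name has been rewritten to "hip".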
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110715
Approved by: https://github.com/ezyang