fix device check in op bench (#30091)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30091
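Fix the device check in the operator benchmark so that `--device` restricts the run to the requested device. Previously, `--device cuda` also ran the CPU test cases, as the Before output in the test plan shows.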
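For illustration only, the intended filtering can be sketched roughly as below. This is a minimal sketch, not the code changed in this PR: `keep_test`, the dict-based configs, and the rule that configs without a `device` entry are always kept are all assumptions made for the example.

```python
# Hypothetical sketch: these names are illustrative and are not the
# operator_benchmark API.

def keep_test(test_config, requested_device=None):
    """Return True if a test config should run for the requested device."""
    # No device filter given: keep every config.
    if requested_device is None:
        return True
    # Assumption: configs without a device entry are device-agnostic.
    if "device" not in test_config:
        return True
    # Otherwise run the config only when its device matches the request.
    return test_config["device"] == requested_device


configs = [
    {"name": "abs_M512_N512_cpu", "M": 512, "N": 512, "device": "cpu"},
    {"name": "abs_M512_N512_cuda", "M": 512, "N": 512, "device": "cuda"},
]

# With a "cuda" filter, only the cuda config is selected.
selected = [c["name"] for c in configs if keep_test(c, "cuda")]
print(selected)
```

With the filter set to `cuda`, only the `abs_M512_N512_cuda` entry is selected, which matches the After output in the test plan.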
Test Plan:
```
Before:
buck run mode/opt //caffe2/benchmarks/operator_benchmark/pt:unary_test -- --device cuda
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short
# Benchmarking PyTorch: abs
# Mode: Eager
# Name: abs_M512_N512_cpu
# Input: M: 512, N: 512, device: cpu
Forward Execution Time (us) : 91.190
# Benchmarking PyTorch: abs
# Mode: Eager
# Name: abs_M512_N512_cuda
# Input: M: 512, N: 512, device: cuda
Forward Execution Time (us) : 27.062
After:
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short
# Benchmarking PyTorch: abs
# Mode: Eager
# Name: abs_M512_N512_cuda
# Input: M: 512, N: 512, device: cuda
Forward Execution Time (us) : 28.154
# Benchmarking PyTorch: abs_
# Mode: Eager
# Name: abs__M512_N512_cuda
# Input: M: 512, N: 512, device: cuda
Forward Execution Time (us) : 15.959
...
```
Reviewed By: hl475
Differential Revision: D18595176
fbshipit-source-id: 048c5b7b2a5318c3687412e12e8d2d5f380a8139