[inductor][triton] if device is a torch.device, then make cuda_properties index it correctly (#87174)
Without this, running `examples/imagenet` raised a `KeyError`: the `cuda_properties` lookup assumed the device was an integer index, but it was being passed a `torch.device`.
```text
(pytorch) soumith@bluebox:~/code/examples/imagenet$ python main.py --gpu 0 /home/soumith/dataset/imagenet
/home/soumith/code/vision/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
/home/soumith/code/examples/imagenet/main.py:100: UserWarning: You have chosen a specific GPU. This will completely disable data parallelism.
warnings.warn('You have chosen a specific GPU. This will completely '
Use GPU: 0 for training
=> creating model 'resnet18'
make_fallback(aten.unfold): a decomposition exists, we should switch to it
make_fallback(aten.unfold_backward): a decomposition exists, we should switch to it
Traceback (most recent call last):
  File "/home/soumith/code/pytorch/torch/_inductor/graph.py", line 254, in call_function
    return lowerings[target](*args, **kwargs)
  File "/home/soumith/code/pytorch/torch/_inductor/lowering.py", line 202, in wrapped
    return decomp_fn(*args, **kwargs)
  File "/home/soumith/code/pytorch/torch/_inductor/lowering.py", line 2994, in var_
    diffs = square(sub(x, mean(x, axis, keepdim=True)))
  File "/home/soumith/code/pytorch/torch/_inductor/lowering.py", line 202, in wrapped
    return decomp_fn(*args, **kwargs)
  File "/home/soumith/code/pytorch/torch/_inductor/lowering.py", line 2983, in mean
    sum_result = sum_(x, axis, keepdim)
  File "/home/soumith/code/pytorch/torch/_inductor/lowering.py", line 202, in wrapped
    return decomp_fn(*args, **kwargs)
  File "/home/soumith/code/pytorch/torch/_inductor/lowering.py", line 3211, in sum_
    return fn(x, axis, keepdims, dtype=dtype)
  File "/home/soumith/code/pytorch/torch/_inductor/lowering.py", line 2953, in inner
    result = Reduction.create(
  File "/home/soumith/code/pytorch/torch/_inductor/ir.py", line 714, in create
    hint, split = cls.num_splits(
  File "/home/soumith/code/pytorch/torch/_inductor/ir.py", line 454, in num_splits
    num_sm = get_device_properties(device).multi_processor_count
  File "/home/soumith/code/pytorch/torch/_inductor/cuda_properties.py", line 43, in get_device_properties
    return _properties()[_device(device)]
KeyError: device(type='cuda', index=0)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87174
Approved by: https://github.com/yf225