Delete redundant device/dtype in TensorIterator add_input/add_output (#39798)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39798
add_input's device/dtype arguments are 100% redundant, as compute_types will
always (internally) assert that this dtype matches the expected dtype.
add_output's device/dtype arguments are redundant UNLESS you have an undefined
tensor (in which case they seem to be an indication of what the output type
should be). The one add_output case I killed can never be exercised, see:
```
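import torch
x = torch.randn(3, 4)  # float32 under the default dtype
mask = x.ge(0.5)
# out is int64 while x is float32; running this requires a CUDA device
torch.masked_select(x.cuda(), mask.cuda(), out=torch.zeros((0), dtype=torch.int64, device='cuda'))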
```
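To illustrate the add_input redundancy argument, here is a toy Python
model. It is NOT the actual TensorIterator code: FakeTensor, operand_old
and operand_new are hypothetical names made up for this sketch. It only
shows that an argument which is always asserted equal to a value derived
from the tensor carries no extra information.
```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FakeTensor:
    # Hypothetical stand-in for a defined tensor: it already knows its own type.
    dtype: str
    device: str


def operand_old(t: FakeTensor, device: str, dtype: str) -> Tuple[str, str]:
    # Old style: the caller passes device/dtype, and we assert that they
    # match the tensor. Any call that passes the assert passed redundant info.
    assert device == t.device and dtype == t.dtype
    return (device, dtype)


def operand_new(t: FakeTensor) -> Tuple[str, str]:
    # New style: read device/dtype straight off the tensor.
    return (t.device, t.dtype)
```
For every call where operand_old does not trip its assert, operand_new
returns the same pair, so the extra parameters can be dropped. This does
not model the undefined-tensor case for outputs, where there is no tensor
to read the type from.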
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21981742
Pulled By: ezyang
fbshipit-source-id: a042d1b9fce0ad58b833856ffe32001787551e59