Make structured functions properly check device/dtype of explicit out args (#55150)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55150
Somehow I forgot to add these checks. Now they're in here. Thanks
ngimel for noticing.
This is probably a slight efficiency hit on TensorIterator, which is
probably already doing all these checks. Would be good to follow up
on this, though it may not be easily fixable with the TI rewrite.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: zhangguanheng66
Differential Revision: D27523879
Pulled By: ezyang
fbshipit-source-id: 458e617dbc6de6fcfa9e5841148b30b99f52e001