Fake: fix conv_transpose2d striding (#82846)
The output striding channels-last preservation logic differs between cuda and cpu. For the meta kernel, we can peek at the fake tensor device and use that to determine whether to do cpu or cuda.
You could argue there's a leaking of abstraction here but this seems like a pretty minimal leak and I'm not sure there's a much cleaner way forward for device-specific striding tracing logic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82846
Approved by: https://github.com/ezyang