Fix device_option propagation (#25203)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25203
device_option propagation is completely broken in Caffe2 for cases when pass
through operators are used. As an example Gather operator don't have gradient
and passes through it's inputs, which results in incorrect detection of the
components for sparse parameter aggregation (component will be empty instead of
the real device).
This diff is trying to fix this issue.
Test Plan:
net_transform is finally working with Gather + FloatToHalf transformed model
instead of failing because of incorrect number of components.
Reviewed By: dzhulgakov
Differential Revision: D16936041
fbshipit-source-id: 916551b933469f04e32ddf86ec4b2c07f76c9176