fix handling of replica parameters in DataParallel (#33907)
Summary:
In DataParallel, replica parameters are not leaf tensors, because they are computed via broadcast from the master parameters, and they should be treated as non-leaves. Fixes https://github.com/pytorch/pytorch/issues/33552
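A minimal CPU-only sketch of the leaf versus non-leaf distinction described above. It does not reproduce the DataParallel code path: `clone()` is used here only as a stand-in for the broadcast, and the names `master` and `replica` are illustrative, not taken from the patch.

```python
import torch

# Master parameter: created directly by the user, so autograd sees it as a leaf.
master = torch.nn.Parameter(torch.ones(3))

# Stand-in for a replica: the result of a differentiable op on the master.
# (Assumption: clone() plays the role of the cross-device broadcast.)
replica = master.clone()

print(master.is_leaf)    # True
print(replica.is_leaf)   # False: it has a grad_fn pointing back to master
print(replica.grad_fn is not None)

# Gradients flow through the replica and accumulate on the master.
replica.sum().backward()
print(master.grad)
```

Because the replica is not a leaf, gradients are accumulated on the master parameter rather than on the replica itself.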
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33907
Differential Revision: D20150199
Pulled By: ngimel
fbshipit-source-id: 5965d4115b6b3a8433063126ff6269567872fbeb
Author
Natalia Gimelshein