Fix deserialization of TransformerEncoderLayer (#81832) (#82094)
Summary:
When `activation` is a module, it is not saved directly in the state dictionary but instead under `_modules`. On deserialization, the old version of this code would conclude that `activation` was missing and reset it to ReLU. This version first reconstructs the module's state (including `_modules`) and only falls back to ReLU if `activation` is neither a module nor a function.
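The failure mode can be illustrated without torch. Below is a minimal sketch (all class and function names here are hypothetical stand-ins, not the real PyTorch implementation): a tiny `Module` base class stores submodules in `_modules` the way `nn.Module` does, and `Layer.__setstate__` shows the fixed ordering, restoring state first and only then checking whether `activation` survived.

```python
import pickle


def relu(x):
    # Stand-in for F.relu (hypothetical, for illustration only).
    return max(0.0, x)


class Module:
    """Minimal stand-in for nn.Module: submodules live in _modules."""

    def __init__(self):
        object.__setattr__(self, "_modules", {})

    def __setattr__(self, name, value):
        # Submodule attributes are routed into _modules, not __dict__,
        # mirroring how nn.Module stores them.
        if isinstance(value, Module):
            self._modules[name] = value
        else:
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Called only when normal lookup fails: check _modules.
        modules = object.__getattribute__(self, "_modules")
        if name in modules:
            return modules[name]
        raise AttributeError(name)


class GELU(Module):
    """A module-valued activation, like nn.GELU."""


class Layer(Module):
    def __init__(self, activation=relu):
        super().__init__()
        self.activation = activation  # may be a plain function or a Module

    # Buggy ordering (for contrast): checking the raw state dict first
    # misses module-valued activations, which sit in state["_modules"],
    # so they would be clobbered with relu:
    #
    #   if "activation" not in state:
    #       state["activation"] = relu
    #   self.__dict__.update(state)

    def __setstate__(self, state):
        # Fixed ordering: restore everything (including _modules) first,
        # then fall back to relu only if activation is genuinely absent.
        self.__dict__.update(state)
        if not hasattr(self, "activation"):
            self.activation = relu


layer = Layer(activation=GELU())
restored = pickle.loads(pickle.dumps(layer))
# The module-valued activation survives the round trip instead of
# being silently replaced by relu.
print(type(restored.activation).__name__)
```

With the buggy ordering, `restored.activation` would have been `relu`; with the fix, the `GELU` submodule is found via `_modules` after state restoration and the fallback never fires.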
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81832
Approved by: https://github.com/kit1980, https://github.com/zrphercule
Test Plan:
contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/e68583b4d180066b8e4f108e0d23176a2676421c
Test plan from GitHub:
pytorch oss tests
Reviewed By: jeanschmidt, zrphercule
Differential Revision: D38014872
Pulled By: zdevito
fbshipit-source-id: 938079d768f7981ca55eed3c8828b29a92e06f41
Co-authored-by: Zachary DeVito (Meta Employee) <zdevito@fb.com>