[JIT] Dont include view ops in autodiff graphs (#42027)
Summary:
View ops as outputs of differentiable subgraphs can cause incorrect differentiation. For now, do not include them in the subgraph. This was observed with our autograd tests for MultiheadAttention and nn.Transformer, which currently fail with the legacy executor. This commit fixes those test failures.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42027
Reviewed By: pbelevich
Differential Revision: D22798133
Pulled By: eellison
fbshipit-source-id: 2f6c08953317bbe013933c6faaad20100376c039