[PyTorch] Support norm_first in nn.TransformerEncoderLayer fast path (#78269)
This change is straightforward now that earlier diffs in the stack have cleaned up the code and added test coverage.
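
For context, `norm_first` controls whether layer normalization is applied before each sublayer (pre-norm) or after the residual addition (post-norm). The sketch below illustrates only that ordering in plain Python. It is not the PyTorch implementation or the fast path itself: `toy_attn` and `toy_ff` are hypothetical stand-ins for the attention and feedforward blocks, and learnable LayerNorm parameters, dropout, and masking are left out.

```python
import math
from typing import Callable, List

Vector = List[float]
Sublayer = Callable[[Vector], Vector]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize to zero mean and unit variance (no learnable scale or bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add(a: Vector, b: Vector) -> Vector:
    """Elementwise residual addition."""
    return [p + q for p, q in zip(a, b)]


def encoder_layer(x: Vector, attn: Sublayer, ff: Sublayer, norm_first: bool) -> Vector:
    if norm_first:
        # Pre-norm: normalize the sublayer input, leave the residual stream as-is.
        x = add(x, attn(layer_norm(x)))
        x = add(x, ff(layer_norm(x)))
    else:
        # Post-norm: normalize after each residual addition.
        x = layer_norm(add(x, attn(x)))
        x = layer_norm(add(x, ff(x)))
    return x


# Hypothetical stand-ins for the real sublayers, for illustration only.
def toy_attn(x: Vector) -> Vector:
    return [0.5 * v for v in x]


def toy_ff(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


x = [1.0, 2.0, 4.0, 8.0]
post = encoder_layer(x, toy_attn, toy_ff, norm_first=False)
pre = encoder_layer(x, toy_attn, toy_ff, norm_first=True)
```

In the post-norm case the layer output is itself normalized, while in the pre-norm case the residual stream passes through unnormalized, which is why the two orderings need separate handling.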
Differential Revision: [D36564008](https://our.internmc.facebook.com/intern/diff/D36564008/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78269
Approved by: https://github.com/jbschlosser