Add LN after specialized output embeddings and flexible LCE (#35178)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35178
* add layer norm (LN) after specialized output embeddings
* add flexible LCE inside specialized module
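
For reference, a minimal NumPy sketch of what layer norm does to a batch of output embeddings. This is an illustration only, not the Caffe2/dper code touched by this change; the function name `layer_norm`, its parameters, and the sample `emb` array are all hypothetical.

```python
import numpy as np


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize each row of x over its last axis (hypothetical helper)."""
    x = np.asarray(x, dtype=np.float64)
    # Per-example statistics over the embedding dimension.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    y = (x - mean) / np.sqrt(var + eps)
    # Optional learned scale and shift, applied elementwise.
    if gamma is not None:
        y = y * gamma
    if beta is not None:
        y = y + beta
    return y


# Two made-up embedding vectors of dimension 4.
emb = np.array([[1.0, 2.0, 3.0, 4.0],
                [10.0, 10.0, 10.0, 10.0]])
out = layer_norm(emb)
```

Each row of `out` has approximately zero mean and unit variance; a constant row maps to zeros because `eps` keeps the denominator nonzero.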
Test Plan:
* unit-tests
* buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_4 --
* buck test caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test_6 --
* workflows
* flexible LCE: f177025325
{F232112501}
* LN: f177025301
{F232112982}
Differential Revision: D20586281
fbshipit-source-id: 664e77cb4cb5bec6646cafd2e4afb88aff27df03