Splitting embedding_bag to embedding_bag_forward_only and embedding_bag (#40557)
Summary:
Currently, embedding_bag's CPU kernel queries whether weight.requires_grad() is true. This violates the layering between autograd and op kernels, causing issues in third-party backends like XLA. See this [issue](https://github.com/pytorch/xla/issues/2215) for more details.
This PR hoists the weight.requires_grad() query up to the Python layer and splits embedding_bag into two separate ops: embedding_bag, used when weight.requires_grad() is true, and embedding_bag_forward_only, used when it is false.
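A minimal sketch of the dispatch pattern described above (the toy Tensor class and kernel bodies are illustrative stand-ins, not the PR's actual code; the real ops live in ATen and dispatch through the PyTorch op registry):

```python
class Tensor:
    """Minimal stand-in for a real tensor, just enough to show the dispatch."""
    def __init__(self, data, requires_grad=False):
        self.data = data
        self.requires_grad = requires_grad

def _embedding_bag_forward_only(weight, indices):
    # Inference path: the kernel never inspects autograd state and can
    # skip allocating buffers that only a backward pass would need.
    return sum(weight.data[i] for i in indices)

def _embedding_bag(weight, indices):
    # Training path: a real kernel would additionally stash intermediates
    # (e.g. per-bag offsets) for use by the backward pass.
    return sum(weight.data[i] for i in indices)

def embedding_bag(weight, indices):
    # After the split, the requires_grad() query lives at the Python
    # layer, so backend kernels (CPU, XLA, ...) stay autograd-agnostic.
    if weight.requires_grad:
        return _embedding_bag(weight, indices)
    return _embedding_bag_forward_only(weight, indices)
```

Because the branch happens before any kernel runs, a backend only has to implement plain forward kernels and never needs to reach into autograd internals, which is what broke XLA in the linked issue.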
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40557
Reviewed By: ailzhang
Differential Revision: D22327476
Pulled By: gmagogsfm
fbshipit-source-id: c815b3690d676a43098e12164517c5debec90fdc