Support batched embeddings for 8Bit embedding bag quantization (#55343)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55343
Add support for N-dimensional batches of 2D embedding bags to qembeddingbag_byte_prepack and qembeddingbag_byte_unpack.
Caffe2 already supports this via caffe2::Fused8BitRowwiseQuantizedToFloat and caffe2::FloatToFused8BitRowwiseQuantized; this change brings equivalent support to the PyTorch operators.
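A minimal sketch of the intended behavior, assuming the torch.ops.quantized.embedding_bag_byte_prepack and torch.ops.quantized.embedding_bag_byte_unpack bindings (the tensor shapes here are illustrative, not taken from the PR's tests):

```python
import torch

# A 3D batch of 2D embedding tables: (batch, num_embeddings, embedding_dim).
weights = torch.randn(4, 10, 16, dtype=torch.float32)

# Rowwise 8-bit quantization in the fused format: each row gains 8 trailing
# bytes (a float32 scale and a float32 zero point), so the last dimension
# widens from 16 to 24 while the leading batch dimensions are preserved.
packed = torch.ops.quantized.embedding_bag_byte_prepack(weights)
print(packed.dtype, packed.shape)

# Unpack recovers a float32 tensor with the original shape, up to
# rowwise quantization error.
unpacked = torch.ops.quantized.embedding_bag_byte_unpack(packed)
print(unpacked.shape)
```

Before this change, only 2D weight tensors round-tripped through these two operators; the N-dimensional case required going through the Caffe2 ops.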
Test Plan: buck test //caffe2/test:quantization -- test_embedding_bag_byte
Reviewed By: radkris-git
Differential Revision: D27480917
fbshipit-source-id: 9878751c6cee8a55909fe58a3e8c222ea31c20bb