Leave output and bag_size uninitialized in the fast path of EmbeddingBag to reduce overhead (#36681)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36681
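The idea in the title can be sketched as follows, assuming the fast path overwrites every output element it later reads, so a zero-fill at allocation time is wasted work. This is an illustrative pure-Python model, not the ATen implementation: `embedding_bag_sum` and `UNINIT` are hypothetical names, lists stand in for tensors, and `None` stands in for memory that was allocated but never written.

```python
# Sketch only: None marks a slot that was allocated but never initialized.
UNINIT = None


def embedding_bag_sum(weight, indices, offsets):
    """Sum-mode embedding bag that never relies on pre-zeroed buffers."""
    dim = len(weight[0])
    num_bags = len(offsets)
    # "Uninitialized" allocation: no zero-fill pass over output or bag_size.
    output = [[UNINIT] * dim for _ in range(num_bags)]
    bag_size = [UNINIT] * num_bags
    for b in range(num_bags):
        start = offsets[b]
        end = offsets[b + 1] if b + 1 < num_bags else len(indices)
        # Accumulate locally, then overwrite the whole row, so the previous
        # contents of the row never matter. Empty bags are written as zeros.
        acc = [0.0] * dim
        for i in indices[start:end]:
            row = weight[i]
            for d in range(dim):
                acc[d] += row[d]
        output[b] = acc
        bag_size[b] = end - start
    return output, bag_size


# Three bags over a 3 x 2 weight table; the middle bag is empty.
weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
output, bag_size = embedding_bag_sum(weight, indices=[0, 2, 1], offsets=[0, 2, 2])
```

Skipping initialization is only safe under that assumption: any code path that reads a slot before writing it, for example one that accumulates directly into the output, would still need zeroed buffers.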
Test Plan:
Imported from OSS
Unit tests:
python test/run_test.py -i test_nn -- TestNNDeviceTypeCPU.test_EmbeddingBag_per_sample_weights_failures_cpu
python test/run_test.py -i test_nn -- TestNNDeviceTypeCPU.test_EmbeddingBag_per_sample_weights_and_offsets_cpu
python test/run_test.py -i test_nn -- TestNNDeviceTypeCPU.test_EmbeddingBag_per_sample_weights_and_new_offsets_cpu
python test/run_test.py -i test_nn -- TestNNDeviceTypeCPU.test_EmbeddingBag_per_sample_weights_and_no_offsets_cpu
python test/test_nn.py TestNN.test_embeddingbag_from_pretrained
python test/test_nn.py TestNN.test_embeddingbag_from_pretrained_options
Finally, run the whole test file: python test/test_nn.py
Reviewed By: jspark1105
Differential Revision: D21058006
Pulled By: xing-liu
fbshipit-source-id: 65b36a788839e8b722db3e295e58215b5935d6e8