pytorch
a840afbe - [pytorch][embeddingbag_8bit] Add include_last_offset option to Fused 8bit EmbeddingBag and parallelize the op (#32683)

Commit
4 years ago
[pytorch][embeddingbag_8bit] Add include_last_offset option to Fused 8bit EmbeddingBag and parallelize the op (#32683) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32683 Pull Request resolved: https://github.com/pytorch/glow/pull/4079 Similar to D17768404, we changed the EmbeddingBag operator for 8-bit fused version to add the option to include the last offset and parallelize the op. ghstack-source-id: 97404645 Test Plan: To generate the AVX2 code (`embedding_lookup_fused_8bit_rowwise_idx_avx2.cc`): ``` python hp_emblookup_codegen.py --fused --use-offsets ``` To test the correctness: ``` buck test //caffe2/torch/fb/sparsenn:test -- test_embedding_bag_byte_rowwise_offsets --print-passing-details ``` Reviewed By: yinghai Differential Revision: D19592761 fbshipit-source-id: f009d675ea3f2228f62e9f86b7ccb94700a0dfe0
Author
Parents
Loading