[quant] Add Graph Mode Passes to quantize EmbeddingBag operators (#41612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41612
This change adds preliminary support for quantizing the EmbeddingBag operators. We currently support 4-bit and 8-bit quantization and packing of the weights.
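To illustrate the kind of per-row weight quantization involved, here is a minimal NumPy sketch of 8-bit row-wise affine quantization. This is only an illustration of the scheme; the actual operators use fbgemm's packed layout (per-row scale/bias stored alongside the quantized rows), and the 4-bit path additionally packs two values per byte. The function name is hypothetical.

```python
import numpy as np

def rowwise_quantize_8bit(weight):
    # Illustrative per-row affine quantization: each row gets its own
    # scale and offset, mirroring the byte-prepack scheme at a high level.
    # NOT the real fbgemm packing layout.
    mins = weight.min(axis=1, keepdims=True)
    maxs = weight.max(axis=1, keepdims=True)
    scales = (maxs - mins) / 255.0
    scales[scales == 0] = 1.0  # avoid divide-by-zero on constant rows
    q = np.round((weight - mins) / scales).clip(0, 255).astype(np.uint8)
    return q, scales, mins

# Dequantization is then q * scale + min, applied row-wise.
```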
To quantize these operators, specify the operator name in the `custom_op_name` field of the NoopObserver. Based on the op name (4-bit or 8-bit), the corresponding quantization function is called.
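A rough sketch of the qconfig wiring this describes (the exact invocation lives in the test plan's `test_embedding_bag`; the qconfig variable name here is hypothetical, and the PR-era code used the dynamic-quantization qconfig type):

```python
import torch
from torch.quantization import QConfig, NoopObserver

# Route the EmbeddingBag weights to the 4-bit quantization+packing path
# by naming the op in the observer's custom_op_name field.
embedding_bag_4bit_qconfig = QConfig(
    activation=NoopObserver.with_args(
        dtype=torch.float, custom_op_name="embedding_bag_4bit"),
    weight=NoopObserver.with_args(
        dtype=torch.float, custom_op_name="embedding_bag_4bit"),
)
```

The qconfig would then be assigned to the EmbeddingBag module (or its name) in the qconfig dict passed to the graph-mode quantization entry point.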
Refer to the test plan for how to invoke the qconfig for the embedding_bag ops.
Future versions will add 4-bit and 2-bit qtensors with native support for observing and quantizing them.
NB: This version assumes that the weights in the EmbeddingBag module reside on the same device.
Test Plan:
python test/test_quantization.py TestQuantizeDynamicJitOps.test_embedding_bag
Imported from OSS
Reviewed By: vkuzo, jerryzh168
Differential Revision: D22609342
fbshipit-source-id: 23e33f44a451c26719e6e283e87fbf09b584c0e6