pytorch
13f169c9 - Per Channel in back-propagation function (#97475)

Per Channel in back-propagation function (#97475)

Summary:
Support Per Channel quantization in the gradient computation function.

One workaround added here: the current QNNPACK is not designed to process a [transposed weight](https://fb.workplace.com/groups/pytorch.edge.users/permalink/1283737025829921/), so during backward we simply replace the Per Channel parameters with Per Tensor ones to compute the gradient. A slower learning curve or some WER degradation might be expected; we don't know, nothing is guaranteed.

Test Plan:
Create a synthetic model (an FP32 layer feeding an INT8 layer with Per Channel quantization) and check that the loss decreases.

Differential Revision: D43898794

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97475
Approved by: https://github.com/weiwangmeta
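To make the workaround concrete, here is a minimal PyTorch sketch (not the code from this commit) of the idea: a per-channel quantized weight is re-quantized per-tensor before it is used for the gradient computation, so the transposed weight can be consumed by kernels that do not support per-channel layouts. The helper name `per_channel_to_per_tensor` and the symmetric per-tensor scale choice are illustrative assumptions.

```python
import torch


def per_channel_to_per_tensor(w_int8_pc: torch.Tensor) -> torch.Tensor:
    """Re-quantize a per-channel quantized weight as a per-tensor one.

    Hypothetical helper illustrating the workaround described in the commit:
    the backward pass falls back to per-tensor quantization parameters so the
    (transposed) weight can be handled by kernels without per-channel support.
    """
    w_fp32 = w_int8_pc.dequantize()
    # Derive a single scale covering the whole tensor (symmetric range here;
    # the exact scale/zero-point choice is an assumption, not the commit's).
    scale = w_fp32.abs().max().item() / 127.0
    return torch.quantize_per_tensor(
        w_fp32, scale=scale, zero_point=0, dtype=torch.qint8
    )


if __name__ == "__main__":
    # Per-channel quantize a small weight matrix along the output-channel axis.
    w = torch.randn(4, 8)
    scales = w.abs().amax(dim=1) / 127.0
    zero_points = torch.zeros(4, dtype=torch.int64)
    w_pc = torch.quantize_per_channel(
        w, scales, zero_points, axis=0, dtype=torch.qint8
    )

    # Fall back to per-tensor quantization only for the gradient computation.
    w_pt = per_channel_to_per_tensor(w_pc)
    grad_output = torch.randn(2, 4)
    # grad_input = grad_output @ W, using the per-tensor copy of the weight.
    grad_input = grad_output @ w_pt.dequantize()
    print(grad_input.shape)  # torch.Size([2, 8])
```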