[quant] Make FakeQuant use REGISTER_DISPATCH (#33682)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33682
Previously, there were separate APIs for CPU and CUDA. This change keeps a single top-level API for each op, i.e. `fake_quantize_per_tensor_affine` and `fake_quantize_per_channel_affine`, and uses the input tensor's device type to dispatch to the appropriate backend (CPU or CUDA) via REGISTER_DISPATCH.
The CPU kernel implementation is in QuantizedOpKernels.cpp.
The CUDA kernel implementation is in fake_quantize_core.cu.
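A minimal sketch of the dispatch behavior from the Python side: the same top-level op is called regardless of device, and the backend kernel is selected from the input tensor's device. The scale/zero-point values below are arbitrary illustration choices, not from the PR.

```python
import torch

x = torch.randn(2, 3)

# One top-level API; the CPU kernel (QuantizedOpKernels.cpp) handles this call.
y_cpu = torch.fake_quantize_per_tensor_affine(
    x, scale=0.1, zero_point=0, quant_min=0, quant_max=255
)

# The identical call on a CUDA tensor dispatches to the CUDA kernel
# (fake_quantize_core.cu) with no separate API.
if torch.cuda.is_available():
    y_cuda = torch.fake_quantize_per_tensor_affine(
        x.cuda(), scale=0.1, zero_point=0, quant_min=0, quant_max=255
    )
```

Fake quantization simulates the quantize/dequantize round trip in float, so the output has the same dtype and shape as the input.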
Test Plan:
python test/test_fake_quant.py
Benchmark results on CPU, fake-quantizing a tensor of size (2, 256, 128, 128):

Before:
  per tensor quant:  9.91 ms
  per channel quant: 74.94 ms
After:
  per tensor quant:  6.03 ms
  per channel quant: 44.92 ms
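A hedged sketch of how such a CPU timing could be reproduced (the exact benchmark harness used for the numbers above is not part of this PR; iteration count and warm-up are assumptions):

```python
import time
import torch

x = torch.randn(2, 256, 128, 128)

def bench_ms(fn, iters=10):
    """Return average wall-clock milliseconds per call after one warm-up."""
    fn()  # warm-up
    t0 = time.time()
    for _ in range(iters):
        fn()
    return (time.time() - t0) / iters * 1000

# Arbitrary quantization parameters for illustration.
per_tensor_ms = bench_ms(
    lambda: torch.fake_quantize_per_tensor_affine(x, 0.1, 0, 0, 255)
)
print(f"per tensor quant ms {per_tensor_ms}")
```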
Imported from OSS
Differential Revision: D20072656
fbshipit-source-id: 0424f763775f88b93380a452e3d6dd0c90cb814b