per_channel fake quant fp16 and fp64 support (#56894)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56894
Used the ATen dispatch type macro to add support for fp16 and fp64 tensors. Haven't tested on GPU yet; will do so once I can rebuild PyTorch with CUDA.
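
For context, a minimal sketch of the dispatch pattern this change relies on (the function name and the flat per-tensor loop below are simplified illustrations, not the actual PR code; the real kernel indexes scale and zero_point per channel). AT_DISPATCH_FLOATING_TYPES already instantiates the lambda body for fp32 and fp64; the _AND_HALF variant additionally covers fp16, so one scalar_t-templated loop serves all three dtypes:

    #include <ATen/ATen.h>
    #include <ATen/Dispatch.h>
    #include <algorithm>
    #include <cmath>

    // Hypothetical helper illustrating the dispatch pattern, not the PR's kernel.
    void fake_quant_sketch(at::Tensor& out, const at::Tensor& in,
                           float scale, int64_t zero_point,
                           int64_t quant_min, int64_t quant_max) {
      AT_DISPATCH_FLOATING_TYPES_AND_HALF(in.scalar_type(), "fake_quant_sketch", [&] {
        const scalar_t* in_data = in.data_ptr<scalar_t>();
        scalar_t* out_data = out.data_ptr<scalar_t>();
        for (int64_t i = 0; i < in.numel(); ++i) {
          // fake quantize = quantize, clamp, dequantize, all in floating point
          float q = std::nearbyint(static_cast<float>(in_data[i]) / scale) + zero_point;
          q = std::min<float>(std::max<float>(q, quant_min), quant_max);
          out_data[i] = static_cast<scalar_t>((q - zero_point) * scale);
        }
      });
    }

Switching the kernel to a macro variant along these lines is what lets the existing per-channel fake-quant loops accept Half and Double inputs without duplicating the loop body.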
Test Plan:
python test/test_quantization.py TestFakeQuantize.test_forward_per_channel_half_precision_numerics
python test/test_quantization.py TestFakeQuantize
python test/test_quantization.py TestFakeQuantize.test_backward_per_channel_cachemask_cpu
python test/test_quantization.py TestFakeQuantize.test_forward_per_channel_cachemask_cpu
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D28002955
fbshipit-source-id: c9cf17aa0f15f163bfcc8e5ef7b329ca754924fd