QuantizedCUDA implementation (#35463)

Commit

4 years ago

Summary:
Closes https://github.com/pytorch/pytorch/issues/30813

1. Tensor quantization logic (quantize_*) is moved to aten/native/quantized. Previously, all tensor quantization logic lived in aten/quantized/Quantizer.cpp, where it had become complicated and hard to read. That problem should be addressed in a separate refactoring PR. Still, I reworked this partially because I had to add tensor quantization logic for CUDA, and it was natural to move everything to aten/native/quantized.
2. The requirements for running CUDA_tensor_apply* were relaxed so that it can process any tensor that lives on a CUDA device (QuantizedCUDA included).
3. All quantized data types now have a default constructor. NVCC refuses to compile any gpu_kernel or CUDA_tensor_apply* without one.
4. Minor changes in many files to register the QuantizedCUDA backend.
5. test_quantized_tensor is extended to cover the QuantizedCUDA backend where possible.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/35463

Differential Revision: D20896697

Pulled By: jerryzh168

fbshipit-source-id: 163554efa23d11a2b10bbc2492439db4798eb26b
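
For readers unfamiliar with what the quantize_* routines compute, the following is a minimal pure-Python sketch of per-tensor affine quantization. It is not the code from this commit: the function names, the list-based interface, and the default quint8-style range of 0..255 are assumptions made for illustration only. The actual kernels live in C++/CUDA under aten/native/quantized.

```python
def quantize_per_tensor_affine(values, scale, zero_point, qmin=0, qmax=255):
    """Hypothetical helper: q = clamp(round(x / scale) + zero_point, qmin, qmax)."""
    quantized = []
    for x in values:
        # Python's round() rounds halves to the nearest even integer.
        q = round(x / scale) + zero_point
        # Clamp into the representable integer range (assumed 0..255 here).
        quantized.append(max(qmin, min(qmax, q)))
    return quantized


def dequantize_per_tensor_affine(quantized, scale, zero_point):
    """Hypothetical helper: x is approximately (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    q = quantize_per_tensor_affine([0.0, 1.0, -1.0, 200.0], scale=0.5, zero_point=10)
    print(q)
    print(dequantize_per_tensor_affine(q, scale=0.5, zero_point=10))
```

Values outside the representable range saturate at qmin or qmax, so dequantizing them does not recover the original input. A CUDA version would apply the same arithmetic to each element in parallel.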

Author

Aleksandr Fedorov

Committer

facebook-github-bot

Parents

54ed6fd3

pytorch f6daa622 - QuantizedCUDA implementation (#35463)
