[quant][qat] Ensure observer respects device affinity (#47514)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47514
Previously, the scale and zero_point were returned on the CPU even if
the input tensor was on the GPU.
This is because `copy_()` writes into the destination tensor in place, so the destination keeps its own device instead of taking the source tensor's.
Also fixed a bug in the calculate_qparams function where the device was always
set to 'cuda', irrespective of the device id.
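
The behavior described above can be illustrated with a minimal sketch. This is not the code changed in this PR: `qparams_on_input_device` is a hypothetical helper written only for illustration, and the scale/zero_point values are placeholders. It assumes only the public `torch.Tensor.copy_` and `torch.tensor(..., device=...)` APIs and falls back to the CPU when no GPU is available.

```python
import torch


def qparams_on_input_device(x, scale_value, zero_point_value):
    """Hypothetical helper: build qparams on the same device as the input.

    Passing x.device (rather than a hard-coded 'cuda') keeps the device
    index, so an input on cuda:1 yields qparams on cuda:1.
    """
    scale = torch.tensor([scale_value], dtype=torch.float32, device=x.device)
    zero_point = torch.tensor([zero_point_value], dtype=torch.int64, device=x.device)
    return scale, zero_point


# Use the GPU when one is present; otherwise the sketch still runs on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
x = torch.randn(4, device=device)

# copy_() is in place: the destination buffer stays on the device it was
# created on (here the CPU), even when the source tensor lives on the GPU.
cpu_buf = torch.zeros(1)
cpu_buf.copy_(x.abs().max().reshape(1))

# Placeholder values; a real observer would derive these from min/max stats.
scale, zero_point = qparams_on_input_device(x, 0.1, 0)
```

Creating the tensors with `device=x.device` is one way to avoid both problems at once, since it follows the input's device type and index rather than relying on where a preallocated buffer happens to live.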
Test Plan:
python test/test_quantization.py TestObserver.test_observer_qparams_respects_device_affinity
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D24800495
fbshipit-source-id: d7a76c59569842ed69029d0eb4fa9df63f87e28c