[Vulkan] Enable copying QInt8 and QInt32 tensors from cpu to vulkan. (#90357)
Summary:
Copying QInt8 and QInt32 from cpu to vulkan:
- Added shader nchw_to_image_int8
- Added shader nchw_to_image_int32
Copying QInt8 and QInt32 from vulkan to cpu
Note: This functionality is currently disabled until issues on Android are resolved.
- Added shader image_to_nchw_int32
- QInt8 works with the same existing image_to_nchw_quantized shaders
Added multiple tests for each supported dtype:
- cpu_to_vulkan_and_dequantize:
These tests check the correctness of copying quantized cpu tensor to vulkan by comparing the output of the following:
- cpu float tensor -> quantize -> to vulkan -> dequantize -> to cpu
- cpu float tensor -> quantize -> dequantize
- cpu_to_vulkan_and_vulkan_to_cpu
(currently disabled until copying vulkan quantized to cpu is enabled):
These tests check the correctness of copying from cpu to vulkan and from vulkan to cpu by creating a random cpu float tensor, quantizing it, then copying it to vulkan, then back to cpu and comparing the output tensor to the original quantized tensor.
- quantize_per_tensor_and_vulkan_to_cpu
(currently disabled until copying vulkan quantized to cpu is enabled):
These tests check the correctness of copying quantized tensor from vulkan to cpu by comparing the output of the following:
- cpu float tensor -> to vulkan -> quantize -> to cpu
- cpu float tensor -> quantize
Test Plan:
On Mac
```
cd ~/fbsource
buck1 run -c pt.vulkan_full_precision=1 //xplat/caffe2:pt_vulkan_quantized_api_test_binAppleMac\#macosx-arm64
```
On Android
```
cd ~/fbsource
buck1 build -c ndk.custom_libcxx=false -c pt.enable_qpl=0 -c pt.vulkan_full_precision=1 //xplat/caffe2:pt_vulkan_quantized_api_test_binAndroid\#android-arm64 --show-output
adb push buck-out/gen/xplat/caffe2/pt_vulkan_quantized_api_test_binAndroid\#android-arm64 /data/local/tmp/vulkan_quantized_api_test
adb shell "/data/local/tmp/vulkan_quantized_api_test"
```
Reviewed By: kimishpatel
Differential Revision: D41654287
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90357
Approved by: https://github.com/SS-JIA