pytorch
2c761caa - [Vulkan] cat operator for channel dimension (#66669)

Commit
3 years ago
[Vulkan] cat operator for channel dimension (#66669)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66669

Implemented `cat` operator for channel dimension

**Facts:**
* texture coordinate: x(width), y(height), z(depth)
* input x, y, z -> no change
* out x, y -> no change
* out z and index i, j only matter

**Equations:**
batch_size = bt0 (or bt1 or bt2 or ...) = # of batch for tensor i
ch_size = ch0 (or ch1 or ch2 or ...) = # of channels for tensor i
ch_interval = ch0 + ch1 + ch2 + ... = total # of channels for all tensors
ch_size_allprior = ch0 (or ch0+ch1 or ch0+ch1+ch2 or ...) = # of channels for tensor 0 to i-1 where pos.z = d (input)
i = index of input texel = vec4[i] of texel at posIn(x,y,z) on input texture
j = index of output texel = vec4[j] of texel at posOut(x',y',z') on output texture
posIn[i] = {x,y,z} at ith index of vec4
src_index = posIn.z * 4 + i
dst_index = int(src_index / ch_size) * ch_interval + (src_index % ch_size) + ch_size_allprior
d = posOut.z = int(dst_index / 4)
j = (dst_index % 4)
posOut[j] = {posIn.x, posIn.y, d} at jth index of vec4

**Shader pseudo code:**

    posOut = posIn;
    for (i = 0; i < 4; ++i) {
      src_index = posIn.z * 4 + i;
      if (src_index >= ch_size * batch_size) break; // out of range
      dst_index = int(src_index / ch_size) * ch_interval + (src_index % ch_size) + ch_size_allprior;
      posOut.z = int(dst_index / 4);
      j = (dst_index % 4);
      uOutput[j] = uInput[i];
    }

Test Plan:
Test build on Android:
```
cd ~/fbsource
buck build -c ndk.custom_libcxx=false -c pt.enable_qpl=0 //xplat/caffe2:pt_vulkan_api_test_binAndroid\#android-arm64 --show-output
adb push buck-out/gen/xplat/caffe2/pt_vulkan_api_test_binAndroid\#android-arm64 /data/local/tmp/vulkan_api_test
adb shell "/data/local/tmp/vulkan_api_test"
```

Test result:
```
[ RUN      ] VulkanAPITest.cat_dim1_samefeature_success
[       OK ] VulkanAPITest.cat_dim1_samefeature_success (101 ms)
[ RUN      ] VulkanAPITest.cat_dim1_difffeature_success
[       OK ] VulkanAPITest.cat_dim1_difffeature_success (81 ms)
[ RUN      ] VulkanAPITest.cat_dim1_texture2d_success
[       OK ] VulkanAPITest.cat_dim1_texture2d_success (2 ms)
[ RUN      ] VulkanAPITest.cat_dim1_singledepth_success
[       OK ] VulkanAPITest.cat_dim1_singledepth_success (6 ms)
[ RUN      ] VulkanAPITest.cat_dim1_singletensor_success
[       OK ] VulkanAPITest.cat_dim1_singletensor_success (21 ms)
[ RUN      ] VulkanAPITest.cat_dim1_twotensors_success
[       OK ] VulkanAPITest.cat_dim1_twotensors_success (53 ms)
[ RUN      ] VulkanAPITest.cat_dim1_bat1_ch4multiple_success
[       OK ] VulkanAPITest.cat_dim1_bat1_ch4multiple_success (17 ms)
[ RUN      ] VulkanAPITest.cat_dim2_sameheight_success
[       OK ] VulkanAPITest.cat_dim2_sameheight_success (83 ms)
[ RUN      ] VulkanAPITest.cat_dim2_diffheight_success
[       OK ] VulkanAPITest.cat_dim2_diffheight_success (86 ms)
[ RUN      ] VulkanAPITest.cat_dim2_singledepth_success
[       OK ] VulkanAPITest.cat_dim2_singledepth_success (5 ms)
[ RUN      ] VulkanAPITest.cat_dim2_invalidinputs_exceptions
[       OK ] VulkanAPITest.cat_dim2_invalidinputs_exceptions (82 ms)
```

Reviewed By: SS-JIA

Differential Revision: D31593623

fbshipit-source-id: e52dc57985e3f0bb9b20313d4fcc7248a436e863
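
The index mapping in the equations above can be checked on the CPU with a small simulation. The following is a minimal sketch in Python, not the shader from this commit: it assumes each input is a flat list with one value per (batch, channel) slot, so that `src_index = z * 4 + i` addresses it directly, and it ignores x and y because the commit message says they pass through unchanged. The function names `cat_channel_dst` and `simulate_cat` are hypothetical and exist only for illustration.

```python
def cat_channel_dst(src_index, ch_size, ch_interval, ch_size_allprior):
    """Map a flattened (batch, channel) index of one input to the output.

    Mirrors the dst_index equation from the commit message.
    """
    batch = src_index // ch_size      # int(src_index / ch_size)
    channel = src_index % ch_size     # src_index % ch_size
    return batch * ch_interval + channel + ch_size_allprior


def simulate_cat(inputs, batch_size):
    """Concatenate flat (batch, channel) lists along the channel dimension.

    Assumption: each input holds batch_size * ch_size values, packed four
    per texel along z, as in the shader pseudo code.
    """
    ch_sizes = [len(t) // batch_size for t in inputs]
    ch_interval = sum(ch_sizes)                 # total channels of all inputs
    out = [None] * (batch_size * ch_interval)
    ch_size_allprior = 0                        # channels of inputs 0..i-1
    for tensor, ch_size in zip(inputs, ch_sizes):
        depth = (len(tensor) + 3) // 4          # number of vec4 texels along z
        for z in range(depth):
            for i in range(4):
                src_index = z * 4 + i
                if src_index >= ch_size * batch_size:
                    break                       # out of range (padding lanes)
                dst_index = cat_channel_dst(
                    src_index, ch_size, ch_interval, ch_size_allprior
                )
                d, j = divmod(dst_index, 4)     # output texel z and lane
                out[d * 4 + j] = tensor[src_index]
        ch_size_allprior += ch_size
    return out


if __name__ == "__main__":
    # batch_size = 2; tensor a has 3 channels, tensor b has 2 channels.
    a = ["a00", "a01", "a02", "a10", "a11", "a12"]
    b = ["b00", "b01", "b10", "b11"]
    print(simulate_cat([a, b], batch_size=2))
    # ['a00', 'a01', 'a02', 'b00', 'b01', 'a10', 'a11', 'a12', 'b10', 'b11']
```

Each batch of the output holds the channels of `a` followed by the channels of `b`, which is the expected result of concatenating along dimension 1. How a real Vulkan texture pads channels within each batch may differ from this flat layout, so the sketch only checks the arithmetic of the equations.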