[pytorch] add Vulkan support for the `t` and `transpose` operators for 2d, 3d and 4d tensors (#101808)
Summary:
Use the existing permute shader to implement the following two operators for the Vulkan backend:
- `aten::transpose`: swaps two dimensions of the input. Its behavior is documented at https://pytorch.org/docs/stable/generated/torch.transpose.html.
- `aten::t`: its behavior is documented at https://pytorch.org/docs/stable/generated/torch.t.html#torch.t. 1d tensors are returned as is. When the input is a 2d tensor, this is equivalent to `aten::transpose(input, 0, 1)`.
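
For reviewers, the reduction of both operators to a permute can be sketched as below. This is a minimal pure-Python illustration of the dimension bookkeeping only, not the code in this diff: the helper names (`normalize_dim`, `transpose_as_permute_dims`, `t_as_permute_dims`, `permuted_shape`) are hypothetical, and treating 0d inputs as a no-op follows the `torch.t` documentation rather than anything stated in this summary.

```python
from typing import List


def normalize_dim(dim: int, ndim: int) -> int:
    """Map a possibly negative dimension index into [0, ndim)."""
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} is out of range for a {ndim}d tensor")
    return dim % ndim


def transpose_as_permute_dims(ndim: int, dim0: int, dim1: int) -> List[int]:
    """Return the permute order that swaps dim0 and dim1 (hypothetical helper)."""
    d0 = normalize_dim(dim0, ndim)
    d1 = normalize_dim(dim1, ndim)
    dims = list(range(ndim))  # identity order: [0, 1, ..., ndim - 1]
    dims[d0], dims[d1] = dims[d1], dims[d0]
    return dims


def t_as_permute_dims(ndim: int) -> List[int]:
    """Return the permute order for t(): identity below 2d, swap(0, 1) for 2d."""
    if ndim > 2:
        raise ValueError("t() expects a tensor with at most 2 dimensions")
    if ndim < 2:
        return list(range(ndim))  # returned as is
    return transpose_as_permute_dims(ndim, 0, 1)


def permuted_shape(shape: List[int], dims: List[int]) -> List[int]:
    """Shape of the output after applying the permute order `dims`."""
    return [shape[d] for d in dims]


# Swapping depth and width of a 4d tensor of shape [2, 3, 5, 7].
order = transpose_as_permute_dims(4, 1, 3)
print(order, permuted_shape([2, 3, 5, 7], order))
```

Swapping a dimension with itself yields the identity order, which is the situation the `*_and_*` same-dimension tests (for example `transpose_2d_width_and_width_large`) appear to exercise.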
Test Plan:
In a local fbsource checkout on a MacBook, run `buck run --target-platforms ovr_config//platform/macos:arm64-fbsource //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64 -c pt.vulkan_full_precision=1`
- Full test results: P739033174.
- Results for the `aten::t` and `aten::transpose` tests are shown below:
```
(base) luwei@luwei-mbp fbsource % buck run --target-platforms ovr_config//platform/macos:arm64-fbsource //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64 -c pt.vulkan_full_precision=1
[... other tests ...]
[ RUN ] VulkanAPITest.transpose_t_1d
[ OK ] VulkanAPITest.transpose_t_1d (0 ms)
[ RUN ] VulkanAPITest.transpose_t_2d_small
[ OK ] VulkanAPITest.transpose_t_2d_small (1 ms)
[ RUN ] VulkanAPITest.transpose_t_2d_medium
[ OK ] VulkanAPITest.transpose_t_2d_medium (0 ms)
[ RUN ] VulkanAPITest.transpose_t_2d_large
[ OK ] VulkanAPITest.transpose_t_2d_large (0 ms)
[ RUN ] VulkanAPITest.transpose_2d_height_and_width_small
[ OK ] VulkanAPITest.transpose_2d_height_and_width_small (0 ms)
[ RUN ] VulkanAPITest.transpose_2d_height_and_width_medium
[ OK ] VulkanAPITest.transpose_2d_height_and_width_medium (0 ms)
[ RUN ] VulkanAPITest.transpose_2d_height_and_width_large
[ OK ] VulkanAPITest.transpose_2d_height_and_width_large (0 ms)
[ RUN ] VulkanAPITest.transpose_2d_height_and_height_large
[ OK ] VulkanAPITest.transpose_2d_height_and_height_large (0 ms)
[ RUN ] VulkanAPITest.transpose_2d_width_and_width_large
[ OK ] VulkanAPITest.transpose_2d_width_and_width_large (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_height_and_width_small
[ OK ] VulkanAPITest.transpose_3d_height_and_width_small (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_height_and_width_medium
[ OK ] VulkanAPITest.transpose_3d_height_and_width_medium (1 ms)
[ RUN ] VulkanAPITest.transpose_3d_height_and_width_large
[ OK ] VulkanAPITest.transpose_3d_height_and_width_large (1 ms)
[ RUN ] VulkanAPITest.transpose_3d_width_and_width_large
[ OK ] VulkanAPITest.transpose_3d_width_and_width_large (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_depth_and_width_small
[ OK ] VulkanAPITest.transpose_3d_depth_and_width_small (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_depth_and_width_medium
[ OK ] VulkanAPITest.transpose_3d_depth_and_width_medium (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_depth_and_width_large
[ OK ] VulkanAPITest.transpose_3d_depth_and_width_large (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_depth_and_depth_large
[ OK ] VulkanAPITest.transpose_3d_depth_and_depth_large (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_depth_and_height_small
[ OK ] VulkanAPITest.transpose_3d_depth_and_height_small (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_depth_and_height_medium
[ OK ] VulkanAPITest.transpose_3d_depth_and_height_medium (0 ms)
[ RUN ] VulkanAPITest.transpose_3d_depth_and_height_large
[ OK ] VulkanAPITest.transpose_3d_depth_and_height_large (2 ms)
[ RUN ] VulkanAPITest.transpose_3d_height_and_height_large
[ OK ] VulkanAPITest.transpose_3d_height_and_height_large (1 ms)
[ RUN ] VulkanAPITest.transpose_4d_batch_and_batch_large
[ OK ] VulkanAPITest.transpose_4d_batch_and_batch_large (1 ms)
[ RUN ] VulkanAPITest.transpose_4d_depth_and_depth_large
[ OK ] VulkanAPITest.transpose_4d_depth_and_depth_large (1 ms)
[ RUN ] VulkanAPITest.transpose_4d_height_and_height_large
[ OK ] VulkanAPITest.transpose_4d_height_and_height_large (1 ms)
[ RUN ] VulkanAPITest.transpose_4d_width_and_width_large
[ OK ] VulkanAPITest.transpose_4d_width_and_width_large (0 ms)
[ RUN ] VulkanAPITest.transpose_4d_batch_and_depth_large
[ OK ] VulkanAPITest.transpose_4d_batch_and_depth_large (1 ms)
[ RUN ] VulkanAPITest.transpose_4d_batch_and_height_large
[ OK ] VulkanAPITest.transpose_4d_batch_and_height_large (2 ms)
[ RUN ] VulkanAPITest.transpose_4d_batch_and_width_large
[ OK ] VulkanAPITest.transpose_4d_batch_and_width_large (2 ms)
[ RUN ] VulkanAPITest.transpose_4d_depth_and_height_large
[ OK ] VulkanAPITest.transpose_4d_depth_and_height_large (2 ms)
[ RUN ] VulkanAPITest.transpose_4d_depth_and_width_large
[ OK ] VulkanAPITest.transpose_4d_depth_and_width_large (2 ms)
[ RUN ] VulkanAPITest.transpose_4d_height_and_width_large
[ OK ] VulkanAPITest.transpose_4d_height_and_width_large (1 ms)
[... other tests ...]
```
Reviewed By: SS-JIA
Differential Revision: D45878333
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101808
Approved by: https://github.com/SS-JIA