[Vulkan] Fix divide-by-zero with padded tensors (#97698)
Summary:
This fixes a divide-by-zero in Vulkan division that occurs when the denominator's channel count isn't a multiple of 4. In that case the channel dimension is padded with 0s, and those padded elements are used as divisors.
More details in the comments of this post: https://fb.workplace.com/groups/pytorch.edge.users/permalink/1288546972015593/
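Illustration (not the code in this diff): a minimal Python sketch of the failure mode, assuming channels are packed four at a time and padded with 0s. The helper names (`pack_channels`, `ieee_div`, `divide_texels`, `divide_texels_guarded`) are hypothetical, and the guard shown is just one possible way to avoid the problem, not necessarily the approach taken in this change.

```python
import math

TEXEL_WIDTH = 4  # assumption: channels are packed in groups of 4


def pack_channels(values, pad=0.0):
    """Pad a per-channel list to a multiple of 4 and split it into 4-wide groups."""
    padded_len = -(-len(values) // TEXEL_WIDTH) * TEXEL_WIDTH  # round up
    padded = list(values) + [pad] * (padded_len - len(values))
    return [padded[i:i + TEXEL_WIDTH] for i in range(0, padded_len, TEXEL_WIDTH)]


def ieee_div(a, b):
    """Float division returning nan/inf instead of raising ZeroDivisionError.

    Simplified: ignores the sign of a negative-zero divisor.
    """
    if b == 0.0:
        if a == 0.0 or math.isnan(a):
            return math.nan
        return math.copysign(math.inf, a)
    return a / b


def divide_texels(num, den):
    """Naive elementwise division over every lane, padding included."""
    return [[ieee_div(a, b) for a, b in zip(tn, td)] for tn, td in zip(num, den)]


def divide_texels_guarded(num, den, channels):
    """Hypothetical guard: treat padded denominator lanes as 1.0."""
    out = []
    for t, (tn, td) in enumerate(zip(num, den)):
        texel = []
        for lane, (a, b) in enumerate(zip(tn, td)):
            is_padding = t * TEXEL_WIDTH + lane >= channels
            texel.append(ieee_div(a, 1.0 if is_padding else b))
        out.append(texel)
    return out


channels = 6  # not a multiple of 4, so the last group has 2 padded lanes
num = pack_channels([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
den = pack_channels([2.0] * channels)

naive = divide_texels(num, den)                       # padded lanes compute 0/0
guarded = divide_texels_guarded(num, den, channels)   # padded lanes stay 0
```

With 6 channels, the last two lanes of the second group are padding, so the naive version computes 0/0 there while the guarded version leaves them at 0.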
Test Plan:
```
buck run --target-platforms ovr_config//platform/macos:arm64-fbsource -c pt.vulkan_full_precision=1 //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64
```
```
buck run --target-platforms ovr_config//platform/macos:arm64-fbsource -c pt.vulkan_full_precision=1 //xplat/caffe2:pt_vulkan_quantized_api_test_binAppleMac\#macosx-arm64
```
Differential Revision: D44392406
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97698
Approved by: https://github.com/SS-JIA