[PyTorch][Vulkan] Templatize BinaryOps (#105380)
Summary:
Use templates to generate the kernels for add, sub, mul, and div, together with their variants (tensor/scalar operand, in-place/out-of-place).
Rename Arithmetic.cpp to BinaryOp.cpp.
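To illustrate the idea of templatizing the four operations and their variants, here is a minimal CPU-only sketch. It is not the code in BinaryOp.cpp: `Buffer`, `binary_op_tensor`, `binary_op_scalar`, and the named wrappers are hypothetical names chosen for this example, and a plain `std::vector<float>` stands in for a Vulkan tensor. It only shows how one template per variant can replace a hand-written kernel per operation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a tensor; the real backend works on Vulkan tensors.
using Buffer = std::vector<float>;

// Tensor-tensor, out-of-place: returns a new buffer.
template <typename Op>
Buffer binary_op_tensor(const Buffer& a, const Buffer& b, Op op = Op{}) {
  assert(a.size() == b.size());  // no broadcasting in this sketch
  Buffer out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = op(a[i], b[i]);
  }
  return out;
}

// Tensor-tensor, in-place: overwrites `self` and returns it.
template <typename Op>
Buffer& binary_op_tensor_(Buffer& self, const Buffer& other, Op op = Op{}) {
  assert(self.size() == other.size());
  for (std::size_t i = 0; i < self.size(); ++i) {
    self[i] = op(self[i], other[i]);
  }
  return self;
}

// Tensor-scalar, out-of-place.
template <typename Op>
Buffer binary_op_scalar(const Buffer& a, float s, Op op = Op{}) {
  Buffer out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = op(a[i], s);
  }
  return out;
}

// Tensor-scalar, in-place.
template <typename Op>
Buffer& binary_op_scalar_(Buffer& self, float s, Op op = Op{}) {
  for (float& x : self) {
    x = op(x, s);
  }
  return self;
}

// Named entry points become thin instantiations of the templates above.
inline Buffer add_tensor(const Buffer& a, const Buffer& b) {
  return binary_op_tensor<std::plus<float>>(a, b);
}
inline Buffer& sub_tensor_(Buffer& self, const Buffer& other) {
  return binary_op_tensor_<std::minus<float>>(self, other);
}
inline Buffer& mul_scalar_(Buffer& self, float s) {
  return binary_op_scalar_<std::multiplies<float>>(self, s);
}
inline Buffer div_scalar(const Buffer& a, float s) {
  return binary_op_scalar<std::divides<float>>(a, s);
}
```

With this layout, adding another binary operation would only need a new function object and a one-line wrapper per variant, which is the kind of duplication removal the summary describes.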
Test Plan:
https://www.internalfb.com/phabricator/paste/view/P785131030
```
buck run --target-platforms ovr_config//platform/macos:arm64-fbsource //xplat/caffe2:pt_vulkan_api_test_binAppleMac\#macosx-arm64 -c pt.vulkan_full_precision=1
...
xplat/caffe2/aten/src/ATen/test/vulkan_api_test.cpp:6377: Skipped
QueryPool is not available
[ SKIPPED ] VulkanAPITest.querypool_flushed_shader_log (0 ms)
[----------] 307 tests from VulkanAPITest (5427 ms total)
[----------] Global test environment tear-down
[==========] 307 tests from 1 test suite ran. (5427 ms total)
[ PASSED ] 306 tests.
[ SKIPPED ] 1 test, listed below:
[ SKIPPED ] VulkanAPITest.querypool_flushed_shader_log
YOU HAVE 5 DISABLED TESTS
```
Differential Revision: D47307169
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105380
Approved by: https://github.com/SS-JIA