vulkan: add fwht support for Intel with shmem reduction (#23964)

* vulkan: add fwht support for Intel with shmem reduction
* don't use N as workgroup size
* disable subgroup shuffle on MoltenVK AMD
* disable fwht shader on Intel Windows due to driver bug
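
For reference, a minimal CPU sketch of the unnormalized fast Walsh-Hadamard transform (fwht) that the bullets above refer to. It is not the shader from this change: the function name `fwht` and the comment about where a shared-memory implementation would place barriers are assumptions for illustration only.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (illustrative sketch).

    `values` must have a power-of-two length. Returns a new list.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # One butterfly stage over blocks of size 2*h. Assumption: a GPU
        # version that keeps `out` in shared memory would need a workgroup
        # barrier between stages, since each stage reads the previous
        # stage's results written by other invocations.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out
```

Applying the sketch twice returns the input scaled by its length, which is a quick sanity check for any implementation of the transform.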