Add more impl_nvfuser for prims (#78493)
This PR adds `test_nvfuser_impl_is_used`, which checks that each prim definition uses the corresponding nvfuser op when one is available.
It also adds `impl_nvfuser=` for atan2, bitwise_and, bitwise_or, bitwise_xor, eq, ne, pow, sub, sum, where, rsqrt, lgamma.
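For illustration only, the kind of check described above could look roughly like the sketch below. It is not the test code from this PR: `fuser_ops`, `prims`, and `prims_missing_fuser_impl` are hypothetical stand-ins so the snippet runs without torch or nvfuser, and matching prims to fuser ops by name is an assumption.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the ops the fuser exposes (not the real nvfuser bindings).
fuser_ops = SimpleNamespace(
    atan2=lambda fd, a, b: ("atan2", a, b),
    rsqrt=lambda fd, a: ("rsqrt", a),
)

# Hypothetical stand-ins for prim definitions; each may carry an impl_nvfuser.
prims = {
    "atan2": SimpleNamespace(impl_nvfuser=fuser_ops.atan2),
    "rsqrt": SimpleNamespace(impl_nvfuser=None),    # fuser op exists but is not wired up
    "digamma": SimpleNamespace(impl_nvfuser=None),  # no fuser op, so nothing to check
}


def prims_missing_fuser_impl(prims, fuser_ops):
    """Return names of prims that have a same-named fuser op but no impl_nvfuser."""
    missing = []
    for name, prim in prims.items():
        has_fuser_op = hasattr(fuser_ops, name)
        has_impl = getattr(prim, "impl_nvfuser", None) is not None
        if has_fuser_op and not has_impl:
            missing.append(name)
    return sorted(missing)
```

A test in this style would assert that the returned list is empty; in the toy data above it contains only the `rsqrt` stand-in, which has a fuser op but no `impl_nvfuser`.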
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78493
Approved by: https://github.com/mruberry