Test torch._refs with aten and nvfuser executors (#78926)
This PR adds testing of references with "aten" and "nvfuser" executors using `torch._prims.executor.make_traced`.
Many tests are skipped even for the "aten" executor because of https://github.com/pytorch/pytorch/issues/78923.
I limited the dtypes for the nvfuser executor tests because running every dtype is slow due to compilation overhead (about 30 minutes in total). With only `float32` and `int32`, the nvfuser tests take about 5 minutes:
```
58 passed, 2507 skipped, 28162 deselected, 79 xfailed, 5 warnings in 297.58s (0:04:57)
```
58 passing tests means that 29 references now work correctly with the nvfuser executor, since each reference is tested with two dtypes (`float32` and `int32`).
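As a rough illustration of the dtype restriction and the count above, here is a minimal sketch in plain Python. The names `NVFUSER_TEST_DTYPES` and `dtypes_to_test`, and the sample dtype list, are hypothetical and do not come from the PR's test code. The sketch assumes each passing reference contributes one test per selected dtype.

```python
# Illustrative sketch only: these names are hypothetical, not taken from the PR.
NVFUSER_TEST_DTYPES = ("float32", "int32")


def dtypes_to_test(executor, supported_dtypes):
    """Return the dtypes to run for one reference under the given executor."""
    if executor == "nvfuser":
        # Keep only a small set of dtypes so compilation time stays manageable.
        return [d for d in supported_dtypes if d in NVFUSER_TEST_DTYPES]
    # Other executors (e.g. "aten") run every supported dtype.
    return list(supported_dtypes)


# Example dtype list for a single reference (made up for illustration).
supported = ["float16", "float32", "float64", "int32", "int64"]
print(dtypes_to_test("aten", supported))
print(dtypes_to_test("nvfuser", supported))

# Assuming one test per reference per selected dtype, 58 passing tests
# correspond to 58 / 2 passing references.
passing_references = 58 // len(NVFUSER_TEST_DTYPES)
print(passing_references)
```

Filtering by executor keeps the "aten" runs exhaustive while bounding the number of nvfuser compilations per reference.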
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78926
Approved by: https://github.com/mruberry