Bazel: Only run ATen codegen once (#70147)
Summary:
Due to a merge conflict, the new Bazel CUDA build runs ATen codegen
twice: first with `--per-operator-headers` enabled, extracting a subset
of the generated files, and then again without the flag to extract the
CUDA files.
This PR instead calls the codegen once and keeps track of which
generated files are CPU and which are CUDA in separate lists.
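As a rough illustration of the idea (not the actual Bazel/Starlark change),
the Python sketch below plans a single codegen invocation whose outputs are
the union of two explicit lists. The file names, `GENERATED_CPU`,
`GENERATED_CUDA`, and `plan_codegen` are all hypothetical and chosen only
for this example.

```python
# Hypothetical sketch: names and file paths are illustrative assumptions,
# not the real lists from the PyTorch build files.
GENERATED_CPU = [
    "aten/src/ATen/Functions.h",
    "aten/src/ATen/RegisterCPU.cpp",
]
GENERATED_CUDA = [
    "aten/src/ATen/CUDAFunctions.h",
    "aten/src/ATen/RegisterCUDA.cpp",
]


def plan_codegen(cpu_outs, cuda_outs):
    """Describe one codegen run and which outputs each target consumes."""
    # A file must belong to exactly one list, otherwise two targets
    # would both claim (and compile) it.
    overlap = set(cpu_outs) & set(cuda_outs)
    if overlap:
        raise ValueError(f"files listed as both CPU and CUDA: {sorted(overlap)}")
    return {
        "invocations": 1,                          # codegen runs once
        "outs": list(cpu_outs) + list(cuda_outs),  # everything it produces
        "cpu_srcs": list(cpu_outs),                # consumed by the CPU target
        "cuda_srcs": list(cuda_outs),              # consumed by the CUDA target
    }


plan = plan_codegen(GENERATED_CPU, GENERATED_CUDA)
```

Keeping the two lists explicit means the CPU and CUDA targets can each pick
their own sources from the same set of generated outputs, without a second
codegen run.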
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70147
Reviewed By: VitalyFedyunin
Differential Revision: D33413020
Pulled By: malfet
fbshipit-source-id: 4b502c38a209d1aa63d715e2336df6fc5aac2212