Deduplicate codegenOutputQuery to query maximum CUDA compute capabilities (#55901)
Summary:
There were two copies of codegenOutputQuery that differed slightly but were functionally equivalent.
Adding support for a new CUDA or device version required changing both copies and keeping them in sync, so it is better to keep a single copy as the only source of truth.
I kept the implementation that is cleaner and easier to read, and added minor enhancements and comments to improve readability further.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55901
Reviewed By: H-Huang
Differential Revision: D31636917
Pulled By: bertmaher
fbshipit-source-id: 622e1fabc39de4f3f1b1aa9a1544cfbd35a5cfd9