[ROCM] Add ROCm build support to build-triton-wheel.yml (#95142)
This matches upstream and builds the triton wheels locally, so that nightly pytorch wheels can use them without going through pypi.org.
There may be a better approach that builds both wheels at once, but for now, to avoid duplicating code, a device dimension (cuda/rocm) is added to the build matrix, with rocm building from a different repo and commit. The goal is to eventually have a single wheel that supports both backends.
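As a rough illustration of the device dimension (not the actual workflow contents), the sketch below expands a build matrix in Python. The Python versions, the key names, and the repo/commit placeholders are all assumptions made for the example; the real values live in build-triton-wheel.yml.

```python
from itertools import product

# Hypothetical values: the real versions, repos, and pinned commits are not stated here.
PY_VERSIONS = ["3.8", "3.9"]
DEVICES = ["cuda", "rocm"]
SOURCES = {
    "cuda": {"repo": "<upstream-triton-repo>", "commit": "<cuda-pin>"},
    "rocm": {"repo": "<rocm-triton-repo>", "commit": "<rocm-pin>"},
}


def expand_matrix(py_versions, devices, sources):
    """Return one job dict per (python version, device) combination."""
    return [
        {"py_vers": py, "device": dev, **sources[dev]}
        for py, dev in product(py_versions, devices)
    ]


jobs = expand_matrix(PY_VERSIONS, DEVICES, SOURCES)
for job in jobs:
    print(job)
```

Each job carries the repo and commit for its own device, so the build steps themselves can stay shared between cuda and rocm.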
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95142
Approved by: https://github.com/malfet, https://github.com/jithunnair-amd, https://github.com/atalman