Cherry pick sorting patch (#85620)
Fixes https://github.com/csarofeen/pytorch/issues/1947
Cherry-picked patch for torchbench issues where the fusion segmenter asserts in nvfuser:
1. Test that the groups come in the same order in which they are merged.
2. Fix detection of unmappable root domains:
ComputeAtRootDomainMap flags domains that should not be mapped due to
reductions. Previously, the check for whether a domain could cause an
invalid mapping was done against only one domain in each group of
domains found to be mappable so far. That is not sufficient, because
the unmappable domain set is created just once, with no root mapping
information. The fix is to check all consumer domains of a producer
tensor. A second, smaller fix addresses a different problem discovered
after the first fix.
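
A toy illustration of the difference between the two checks described
above. This is not nvfuser code: the function names, the string domain
labels, and the list/set data model are all hypothetical, and the real
logic in ComputeAtRootDomainMap is considerably more involved. It only
shows why inspecting one domain per group can miss a conflict that
inspecting every domain catches.

```python
# Toy model only. All names here are hypothetical and do not mirror
# nvfuser's C++ API.

def is_safe_one_domain(group, unmappable):
    """Old-style check: look at a single domain from the group."""
    representative = group[0]
    return representative not in unmappable


def is_safe_all_domains(group, unmappable):
    """Fixed-style check: look at every domain in the group."""
    return all(domain not in unmappable for domain in group)


# A group of domains found to be mappable so far (hypothetical labels).
group = ["T1.axis0", "T2.axis0"]

# The unmappable set is built once, without root mapping information,
# so it may name a domain that is not the one picked from the group.
unmappable = {"T2.axis0"}

old_result = is_safe_one_domain(group, unmappable)
new_result = is_safe_all_domains(group, unmappable)
```

In this toy setup the single-domain check reports the group as safe,
while the all-domains check reports the conflict.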
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85620
Approved by: https://github.com/csarofeen, https://github.com/davidberard98