Propagate root domain mappings from rfactor to root domains (#1556)
* Propagate root domain mappings from rfactor to root domains in
ComputeAtRootDomainMap
The main purpose of ComputeAtRootDomainMap is to find unmappable domains
for computeAt. This analysis is done by traversing a fusion in a
backward direction. Currently, the traversal only visits arithmetic
expressions, so information propagation is done from consumer tensors to
producer tensors. This propagation is also required from rfactor domains
to root domains. Previously this did not matter because rfactor was
limited to reduction domains, but that is no longer the case with view.
This change also means that ComputeAtRootDomainMap no longer guarantees
one-to-one mappings. For example,
```
tv0: [I0, I1]
tv1 = view(tv0); // tv1: [I0*I1/N, N]
```
I.e., the view op first merges the two domains of `tv0` and then splits
the merged domain by N. Note that both of the two rfactor axes of `tv1`
are now mapped with both of the two axes of `tv0`.
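
The loss of one-to-one mapping can be illustrated with a small toy model.
This is only a sketch: the `merge` and `split` helpers and the dictionary
below are hypothetical and are not part of nvFuser. Each axis is modeled
as the set of root axes it depends on.

```python
# Toy model only: these helpers are hypothetical, not nvFuser APIs.

def merge(a, b):
    """Merge two axes; the result depends on the root axes of both inputs."""
    return a | b

def split(axis):
    """Split one axis into (outer, inner); both halves inherit its root axes."""
    return axis, axis

# Root domain of tv0: each root axis depends only on itself.
i0 = frozenset({"I0"})
i1 = frozenset({"I1"})

# view(tv0): merge I0 and I1, then split the merged axis by N.
outer, inner = split(merge(i0, i1))

# Root axes reachable from each rfactor axis of tv1.
rfactor_to_root = {"I0*I1/N": outer, "N": inner}

# One-to-one would require every rfactor axis to map to exactly one root axis.
is_one_to_one = all(len(roots) == 1 for roots in rfactor_to_root.values())
```

In this model both rfactor axes depend on `I0` and `I1`, so
`is_one_to_one` is false.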
Because of this change, `ComputeAtRootDomainMap::mapBestEffort` and other
mapping functions between a producer and a consumer that are supposed to
return a one-to-one map can fail.
`ComputeAtRootDomainMap::getMappableDims` is fine as it just grabs any
domain that is mappable.
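
The difference between the two kinds of query can be sketched as follows.
The function names and the pair list are assumptions made for
illustration and do not reflect the actual nvFuser implementation: a
one-to-one map builder has to give up when an axis has more than one
partner, while collecting mappable axes does not.

```python
# Toy model only: hypothetical helpers, not the nvFuser implementation.

def map_one_to_one(pairs):
    """Build a one-to-one map from (src, dst) pairs, or return None on conflict."""
    forward, backward = {}, {}
    for src, dst in pairs:
        # A conflict arises if either side already has a different partner.
        if forward.get(src, dst) != dst or backward.get(dst, src) != src:
            return None
        forward[src] = dst
        backward[dst] = src
    return forward

def mappable_dims(pairs):
    """Collect every source axis that has at least one mapped partner."""
    return {src for src, _ in pairs}

# Mappings for the view example: each rfactor axis maps to both root axes.
pairs = [
    ("I0*I1/N", "I0"), ("I0*I1/N", "I1"),
    ("N", "I0"), ("N", "I1"),
]

one_to_one = map_one_to_one(pairs)  # fails: no one-to-one map exists
dims = mappable_dims(pairs)         # still well defined
```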
`ComputeAtRootDomainMap::mapConsumerToProducer` and
`ComputeAtRootDomainMap::mapProducerToConsumer` were used in
`TransformReplay::replayPasC` and `TransformReplay::replayCasP`, but
they do not need `ComputeAtRootDomainMap`; `PairwiseRootDomainMap` is
sufficient, so those usages were replaced with the pairwise variant.