[NNC] Fix some bugs in Round+Mod simplification (#42934)
Summary:
While working on the CUDA codegen, I found that running the IRSimplifier before generating code led to test failures. The cause was a bug in the Round+Mod simplification (e.g. (x / y * y) + (x % y) => x) related to the order in which the terms appear. After fixing it and writing a few tests around those cases, I found a second bug in the simplification of the same pattern and fixed that as well, with additional test coverage.
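For reference, the identity behind the Round+Mod pattern can be checked numerically. The following is a minimal sketch in plain Python, not the NNC implementation: the function names are hypothetical, and Python's `//` and `%` round toward negative infinity whereas C++ integer division truncates toward zero. The identity holds under either convention as long as the division and the remainder use the same one.

```python
def round_plus_mod(x: int, y: int) -> int:
    """Unsimplified form with the Round term first: (x / y * y) + (x % y)."""
    return (x // y) * y + (x % y)


def mod_plus_round(x: int, y: int) -> int:
    """Same terms in the opposite order: (x % y) + (y * (x / y))."""
    return (x % y) + y * (x // y)


def check_identity(limit: int = 20) -> bool:
    """Return True if both term orders reduce to x for every nonzero y in range."""
    for x in range(-limit, limit + 1):
        for y in range(-limit, limit + 1):
            if y == 0:
                continue  # division by zero is outside the pattern
            if round_plus_mod(x, y) != x or mod_plus_round(x, y) != x:
                return False
    return True
```

Both orderings evaluate to x, so a simplifier that recognizes the pattern should produce the same result regardless of which term comes first.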
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42934
Reviewed By: zhangguanheng66
Differential Revision: D23085548
Pulled By: nickgg
fbshipit-source-id: e780967dcaa7a5fda9f6d7d19a6b7e7b4e94374b