Memoize repeated nonzero calls to the same fake tensor (#95399)
This removes the need to explicitly constrain_unify `x[mask]` and `y[mask]` when mask is a boolean tensor. It's very narrow but it seems to work in practice.
To invalidate the nonzero call when mutation occurs, I use version counter. I know there are ways to bypass this but I think it's good enough for now.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95399
Approved by: https://github.com/eellison