To vectorize long datatype as mask index (#91076)
In this PR, we record the fx node currently being executed so that additional information can be cached, which simplifies the vectorization checker. In addition, we add support for `masked` by simplifying it to `mask_load`, which is needed to support `max_pool2d`.
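As a rough illustration of the "record the current node" idea, here is a minimal sketch in plain Python. It is not the Inductor code: `VecCheckerSketch`, `executing`, `check_dtype`, and the `is_mask_index` flag are hypothetical names, and the rule that `int64` is accepted only when the current node is marked as a mask index is an assumption made for the example.

```python
from contextlib import contextmanager


class VecCheckerSketch:
    """Hypothetical checker; illustrates the idea only, not the Inductor implementation."""

    # Assumption: float32 is the only dtype that is always vectorizable here.
    SUPPORTED = {"float32"}

    def __init__(self):
        self.current_node = None   # name of the node being executed
        self.cache = {}            # per-node information, keyed by node name
        self.can_vectorize = True  # overall verdict for the kernel

    @contextmanager
    def executing(self, node_name, **info):
        # Record the node being executed and cache extra facts about it.
        prev = self.current_node
        self.current_node = node_name
        self.cache[node_name] = info
        try:
            yield
        finally:
            self.current_node = prev

    def check_dtype(self, dtype):
        # Consult the cached information for the current node instead of
        # re-deriving it from the surrounding graph.
        info = self.cache.get(self.current_node, {})
        ok = dtype in self.SUPPORTED or (
            dtype == "int64" and info.get("is_mask_index", False)
        )
        self.can_vectorize = self.can_vectorize and ok
        return ok


def demo():
    checker = VecCheckerSketch()
    with checker.executing("index", is_mask_index=True):
        as_mask_index = checker.check_dtype("int64")
    with checker.executing("add"):
        as_plain_value = checker.check_dtype("int64")
    return as_mask_index, as_plain_value, checker.can_vectorize
```

In this sketch the same dtype gets a different verdict depending on which node is being executed, which is the kind of decision the recorded node makes simple to express.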
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91076
Approved by: https://github.com/jgong5, https://github.com/desertfire, https://github.com/jansel