Improve torch.roll performance (#33623)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/33544
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33623
Differential Revision: D20037643
Pulled By: ngimel
fbshipit-source-id: 9fd293eca5242daf414c116344b2e1fde9f9ebc5