optimize channels last for BatchNorm2d on CPU (#59286)
Summary:
Replaces https://github.com/pytorch/pytorch/issues/48919.
Optimize channels last performance for BatchNorm2d on CPU.
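
For context, a minimal pure-Python sketch of why the channels last layout suits per-channel statistics. It is not the kernel added in this PR; the helper names (`strides_contiguous`, `strides_channels_last`, `channel_stats_nhwc`) are hypothetical and exist only for illustration. In channels last, a tensor keeps its logical NCHW shape but is stored as NHWC, so the channel index is the innermost, unit-stride dimension and the per-channel sums can be accumulated in a single sequential pass.

```python
def strides_contiguous(n, c, h, w):
    """Element strides of a logical NCHW tensor in the default layout."""
    # n is unused; it is kept so both helpers share one signature.
    return (c * h * w, h * w, w, 1)


def strides_channels_last(n, c, h, w):
    """Element strides of a logical NCHW tensor stored as NHWC."""
    # The channel stride is 1: channels of one pixel are adjacent in memory.
    return (h * w * c, 1, w * c, c)


def channel_stats_nhwc(buf, n, c, h, w):
    """Per-channel mean and biased variance over a flat NHWC buffer.

    Illustrative only: walks the buffer once, front to back, keeping one
    running sum and one running sum of squares per channel.
    """
    count = n * h * w
    sums = [0.0] * c
    sq_sums = [0.0] * c
    for pixel in range(count):
        base = pixel * c
        for ch in range(c):
            v = buf[base + ch]
            sums[ch] += v
            sq_sums[ch] += v * v
    means = [s / count for s in sums]
    variances = [sq / count - m * m for sq, m in zip(sq_sums, means)]
    return means, variances
```

The inner loop over `ch` touches adjacent elements, which is the access pattern a vectorized CPU kernel can exploit; in the default NCHW layout the same reduction instead works on one contiguous H*W plane per channel.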
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59286
Reviewed By: bdhirsh
Differential Revision: D29008198
Pulled By: VitalyFedyunin
fbshipit-source-id: 8a7d020bd6a42ab5c21ffe788b79a22f4ec82ac0