Fix xnnpack hardswish memory issue (#59577)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59577
Collapse all dimensions of the input tensor into the batch dimension and set channels to 1. This fixes the over-calculation in the 1D case.
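
A minimal sketch of the idea in plain Python, for illustration only. It is not the code from this PR and does not call XNNPACK; `collapse_to_batch` and `hardswish_flat` are hypothetical names. It assumes hardswish is applied elementwise, so treating the tensor as `batch = numel, channels = 1` covers exactly the elements in the buffer for any rank, including 1D:

```python
from math import prod


def hardswish(x: float) -> float:
    # Standard definition: x * relu6(x + 3) / 6.
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def collapse_to_batch(sizes):
    # Hypothetical helper: fold every dimension into the batch and
    # report a single channel, so batch * channels == numel.
    return prod(sizes), 1


def hardswish_flat(values, sizes):
    # `values` stands in for the contiguous tensor buffer (assumption).
    batch, channels = collapse_to_batch(sizes)
    # The flattened shape must cover the buffer exactly, no more and no less.
    assert batch * channels == len(values)
    return [hardswish(values[i]) for i in range(batch * channels)]
```

For a 1D input of size 5 the helper returns `(5, 1)`, and for a `[2, 3, 4]` input it returns `(24, 1)`, so the element count no longer depends on which dimension is interpreted as channels.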
Test Plan:
buck test fbandroid/mode/server fbandroid/mode/asan_ubsan fbsource//xplat/caffe2:pt_xnnpack_test
buck test fbsource//xplat/caffe2:pt_xnnpack_test
Reviewed By: kimishpatel
Differential Revision: D28942141
fbshipit-source-id: b36f820a900b6a2ed649d6b9bac79d3392d3537c