Allow contiguous inputs to use qcat_nhwc_stub when dim is the last dimension (#72575)
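
One plausible reading of why this is safe, given as a stride-level sketch only: in an NHWC (channels-last) tensor concatenated along the channel dimension, and in a contiguous tensor concatenated along its last dimension, the concatenation dimension is the innermost one in memory (unit stride). The helper names below are hypothetical and written for illustration; this is not the dispatch code touched by this PR.

```python
def contiguous_strides(shape):
    """Row-major strides (in elements) for a densely packed tensor."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)


def channels_last_strides(shape):
    """Strides of a 4-D tensor with logical shape (N, C, H, W) stored as NHWC."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def cat_dim_is_innermost(strides, dim):
    """True when the concatenation dimension has unit stride (hypothetical check)."""
    return strides[dim % len(strides)] == 1


shape = (2, 3, 4, 5)

# NHWC layout, concatenating along the channel dimension (dim=1).
nhwc_case = cat_dim_is_innermost(channels_last_strides(shape), dim=1)

# Contiguous layout, concatenating along the last dimension (dim=-1).
contiguous_last_dim_case = cat_dim_is_innermost(contiguous_strides(shape), dim=-1)

# Contiguous layout, concatenating along dim=1: not the innermost dimension.
contiguous_dim1_case = cat_dim_is_innermost(contiguous_strides(shape), dim=1)
```

Under this assumption the first two cases share the same memory access pattern, while the third does not, which matches the "when dim is the last dimension" restriction in the title.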
Differential Revision: [D34113898](https://our.internmc.facebook.com/intern/diff/D34113898)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72575
Approved by: https://github.com/frank-wei