inductor: always convert weight to channels_last for cpu conv (#105517)
For the CPU backend, we always use channels_last to get good performance by avoiding format reorders (block to plain or plain to block). Weight packing also assumes that the weight is channels_last, so we always convert the weight format and do layout optimization for CPU conv.
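For reference, a minimal sketch of what channels_last means for a 4-D conv weight, written in plain Python rather than taken from the inductor code. The helper names `contiguous_strides` and `channels_last_strides` are hypothetical and only illustrate the stride layout; in PyTorch itself the conversion is done with `weight.to(memory_format=torch.channels_last)`.

```python
# Illustrative only: these helpers are not part of PyTorch or inductor.
# They compute element strides for a 4-D tensor of shape (N, C, H, W).

def contiguous_strides(shape):
    """Strides for the default (NCHW, plain contiguous) layout."""
    n, c, h, w = shape
    return (c * h * w, h * w, w, 1)


def channels_last_strides(shape):
    """Strides for channels_last (NHWC physical order, NCHW logical shape)."""
    n, c, h, w = shape
    # The channel dimension becomes the fastest-moving one (stride 1).
    return (h * w * c, 1, w * c, c)


# Example: a conv weight with 64 output channels, 3 input channels, 7x7 kernel.
weight_shape = (64, 3, 7, 7)
print(contiguous_strides(weight_shape))     # (147, 49, 7, 1)
print(channels_last_strides(weight_shape))  # (147, 1, 21, 3)
```

The logical shape is unchanged by the conversion; only the strides (and hence the physical memory order) differ.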
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105517
Approved by: https://github.com/jgong5, https://github.com/shunting314