inductor: separate bias from PackedLinear for better performance (#93348)
For PackedLinear with a bias, we always copy the bias to the output before doing the computation:
https://github.com/pytorch/pytorch/blob/d7a3f2128fb4457dd60fd5d23e77d2c66a8b0f02/aten/src/ATen/native/mkldnn/Linear.cpp#L389-L397.
This PR separates the bias from PackedLinear so that the bias add can be fused with the post-op.
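
As a rough illustration of why the two forms are equivalent, here is a minimal plain-Python sketch. The function names, the tiny inputs, and the choice of ReLU as the post-op are all assumptions made for illustration; this is not the mkldnn kernel or the Inductor lowering.

```python
# Illustrative only: plain-Python stand-ins, not the mkldnn kernel or Inductor code.
# All function names below are hypothetical.


def packed_linear_with_bias(x, w, b):
    """Old form: seed the output with the bias, then accumulate x @ w.T."""
    out = [list(b) for _ in x]  # copy the bias into every output row first
    for i, row in enumerate(x):
        for j, w_row in enumerate(w):
            out[i][j] += sum(a * c for a, c in zip(row, w_row))
    return out


def packed_linear_no_bias(x, w):
    """New form: the linear op computes x @ w.T only."""
    return [[sum(a * c for a, c in zip(row, w_row)) for w_row in w] for row in x]


def fused_bias_relu(y, b):
    """Bias add and a post-op (ReLU assumed here) in one pass over the output."""
    return [[max(v + bj, 0.0) for v, bj in zip(row, b)] for row in y]


x = [[1.0, 2.0], [3.0, -4.0]]               # (batch=2, in_features=2)
w = [[0.5, -1.0], [2.0, 1.0], [-1.0, 0.0]]  # (out_features=3, in_features=2)
b = [0.5, -1.0, 2.0]

# Old path: linear with bias, followed by a separate ReLU pass.
old = [[max(v, 0.0) for v in row] for row in packed_linear_with_bias(x, w, b)]
# New path: bias-free linear, then bias add and ReLU together.
new = fused_bias_relu(packed_linear_no_bias(x, w), b)
```

Both paths produce the same values; the second avoids the up-front bias copy and leaves the bias add as an ordinary elementwise op that can be merged with whatever follows it.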
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93348
Approved by: https://github.com/jgong5, https://github.com/desertfire, https://github.com/jansel