inductor: using dummy input to pack the linear weight for bfloat16 dynamic shape path (#106122)
For the dynamic-shape bfloat16 path, a plain (unpacked) weight cannot dispatch to the AMX kernel, so we use a dummy input (given a None value) to pack the weight, which gives better performance.
Before (oneDNN verbose log; plain weight, gemm:jit kernel):
```
onednn_verbose,exec,cpu,inner_product,x64:gemm:jit,forward_training,src_bf16::blocked:ab:f0 wei_bf16::blocked:ab:f0 bia_bf16::blocked:a:f0 dst_bf16::blocked:ab:f0,attr-scratchpad:user ,,mb64ic256oc256,9.4292
```
After (oneDNN verbose log; packed weight, AMX bf16 brgemm kernel):
```
onednn_verbose,exec,cpu,inner_product,brgemm:avx512_core_amx_bf16,forward_training,src_bf16::blocked:ab:f0 wei_bf16::blocked:AB16b32a2b:f0 bia_bf16::blocked:a:f0 dst_bf16::blocked:ab:f0,attr-scratchpad:user ,,mb64ic256oc256,0.35498
```
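For reference, the two log lines can be compared with a small script. This is only a sketch: the helper `parse_verbose_line` and the `VerboseRecord` type are hypothetical names, and the field positions (implementation in the fifth comma-separated field, problem descriptor second to last, execution time in milliseconds last) are assumed from the two lines shown here rather than from a general parser for oneDNN verbose output.

```python
from typing import NamedTuple


class VerboseRecord(NamedTuple):
    """Subset of one oneDNN verbose line (hypothetical helper type)."""
    primitive: str
    impl: str
    problem: str
    time_ms: float


def parse_verbose_line(line: str) -> VerboseRecord:
    # Assumption: fields are comma-separated and laid out as in the two
    # log lines quoted in this commit message.
    fields = line.strip().split(",")
    return VerboseRecord(
        primitive=fields[3],        # e.g. "inner_product"
        impl=fields[4],             # kernel implementation that was chosen
        problem=fields[-2],         # e.g. "mb64ic256oc256"
        time_ms=float(fields[-1]),  # execution time in milliseconds
    )


BEFORE = (
    "onednn_verbose,exec,cpu,inner_product,x64:gemm:jit,forward_training,"
    "src_bf16::blocked:ab:f0 wei_bf16::blocked:ab:f0 bia_bf16::blocked:a:f0 "
    "dst_bf16::blocked:ab:f0,attr-scratchpad:user ,,mb64ic256oc256,9.4292"
)
AFTER = (
    "onednn_verbose,exec,cpu,inner_product,brgemm:avx512_core_amx_bf16,"
    "forward_training,src_bf16::blocked:ab:f0 wei_bf16::blocked:AB16b32a2b:f0 "
    "bia_bf16::blocked:a:f0 dst_bf16::blocked:ab:f0,attr-scratchpad:user ,,"
    "mb64ic256oc256,0.35498"
)

before = parse_verbose_line(BEFORE)
after = parse_verbose_line(AFTER)
speedup = before.time_ms / after.time_ms

print(f"{before.impl}: {before.time_ms} ms")
print(f"{after.impl}: {after.time_ms} ms")
print(f"speedup for {before.problem}: {speedup:.1f}x")
```

For this single `mb64ic256oc256` call the ratio works out to roughly 26x; it is one measurement from the logs above, not a general benchmark.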
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106122
Approved by: https://github.com/jgong5, https://github.com/eellison