Fix Llama meta tensor loading in AutoTP and kernel-injected inference (#3608)
* Adapt to Llama when loading the model with meta tensors
* Fix model-parallel (mp) sharding of gated MLP parameters
* Re-enable meta tensors for kernel injection and fix loading of layer params with meta tensors (see the sketch after this list)
* Revert the mlp_inter_mp change for gated MLP, as the underlying issue is fixed
* Monkey-patch to fix Llama output
* Fix formatting
* Add comment
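
For context, a minimal sketch of the meta tensor loading pattern these changes target, assuming torch >= 2.0 and transformers; the model id and variable names are illustrative, not taken from this PR:

```python
# Minimal sketch of meta tensor loading (not the DeepSpeed implementation).
import torch
from transformers import AutoConfig, AutoModelForCausalLM

# Hypothetical Llama checkpoint id, for illustration only.
config = AutoConfig.from_pretrained("huggyllama/llama-7b")

# Instantiate on the meta device: parameters carry shape/dtype metadata
# but no storage, so even a large Llama model constructs instantly.
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config)

# All params are meta tensors here; touching their data before the real
# checkpoint weights are materialized is the kind of failure this PR fixes.
assert all(p.is_meta for p in model.parameters())
```

In DeepSpeed inference, the real checkpoint weights are subsequently loaded to replace these placeholder parameters, which is the step being fixed here for Llama under AutoTP and kernel injection.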
---------
Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
Co-authored-by: Lev Kurilenko <113481193+lekurile@users.noreply.github.com>