Add a policy to run the llama model from the official repo (#4313)
* Add llama2 support from the official llama repo
* add back commented-out function
* add new policy & implementation for llama2
* add some changes to inject/run the 70b llama model
* remove debugging code
* remove more debugging code
* formatting
* use num_kv only when it has a positive value
* use the num_kv param only if it is positive
* fix syntax and formatting errors
* fix an issue with the float32 transform kernel
---------
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>