Upgrade GIST memory compression nodes, kernels, optimizer rule, and cli (#6262)
* Add gist nodes, kernels, optimizer rule, and cli
* Add Gist CUDA kernels
* Added/updated Gist compression CLI for bert, gpt2, mnist
* Fix decode priority generator for large models
* Fix hardcoded decode priority generator, update gist training test
* Fix incomplete if/else sequence for CI build
* Added MSFP15 for gist compression type
* Fix MSFP15 bug
* Resolved Azure pipeline errors: unsupported ORT_RETURN macro format, cudastream argument
* Resolved hardcoded cudastream argument, Pack8 zero error
* Resolved PR comments - except gist tests
* Added TypeInference to Gist nodes, added "To" attribute to Gist Decoder, updated Gist test cases
* Reverted error in merge commit
* Updated logger usage in Gist rule, updated explanation of the GistPackMSFP15 compressed tensor
* Converted onnxruntime::make_unique to std::make_unique based on PR 7502
Co-authored-by: Fanny Nina Paravecino <faninapa@microsoft.com>
Co-authored-by: Aayush Ankit <aayushankit@microsoft.com>
Co-authored-by: Aayush Ankit <Aayush-Ankit@users.noreply.github.com>
Co-authored-by: Fanny Nina Paravecino <fanny.nina@microsoft.com>