Mistral Optimization & Benchmarking Support (#18225)
### Description
As a prerequisite for this model running correctly, two PRs need to be
merged:
- GQA Sliding Window Attention:
https://github.com/microsoft/onnxruntime/tree/aciddelgado/gqa_local
- MHA Fusion:
https://github.com/frankdongms/onnxruntime/tree/frdong/llama_70b
This PR adds optimization, quantization, and benchmarking support for
Mistral. The README included describes steps to export, optimize, and
benchmark Mistral models, but won't function correctly without the two
above branches being merged first.
---------
Co-authored-by: Peter McAughan <petermca@microsoft.com>
Co-authored-by: Abhishek Jindal <abjindal@microsoft.com>
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>