Add LLaMA end-to-end benchmarking (#19985)
### Description
This PR adds a benchmarking script to measure end-to-end performance and
saves the results in a CSV file.
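The actual script lives in the ONNX Runtime repository; purely as an illustration of the measure-then-write-CSV pattern this PR follows, here is a minimal, self-contained sketch (all function names, column names, and the stand-in workload are hypothetical, not taken from the script itself):

```python
import csv
import statistics
import time


def benchmark(fn, warmup=2, runs=10):
    """Time a callable end to end and return latency stats in milliseconds."""
    for _ in range(warmup):  # discard warmup iterations (caches, JIT, allocators)
        fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1e3)
    return {
        "avg_ms": statistics.mean(latencies),
        "p50_ms": statistics.median(latencies),
        "min_ms": min(latencies),
        "max_ms": max(latencies),
    }


def save_results(rows, path="benchmark_results.csv"):
    """Write one CSV row per benchmarked configuration."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


# Stand-in workload in place of a real model inference call.
rows = []
for batch_size in (1, 4):
    stats = benchmark(lambda: sum(range(10_000 * batch_size)))
    rows.append({"batch_size": batch_size, **stats})
save_results(rows)
```

In the real script the timed callable would be a model inference call (e.g. an ONNX Runtime session run) rather than the arithmetic placeholder above.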
### Motivation and Context
With this PR, end-to-end performance can be easily measured for many
large language models such as LLaMA-2. The performance numbers for
LLaMA-2 are located
[here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/python/models/llama).