onnxruntime
a2b0a69d - Update MultiHeadAttention benchmark to test CPU (#20972)

Commit
1 year ago
### Description

The MultiHeadAttention benchmark script currently supports only the CUDA provider. This change updates the script to support testing the CPU operator and plotting GPU latency.

### Motivation and Context

Benchmark for the upcoming CPU flash attention.