onnxruntime
e48dc3b2 - Parallelize Transpose (#16854)

Commit
2 years ago
Gives up to a 5.6% improvement for prompt processing and a 2.3% improvement for token generation in the LLaMA 7B case.