onnxruntime
e48dc3b2
- Parallelize Transpose (#16854)
Commit
2 years ago
Parallelize Transpose (#16854)

It gives up to a 5.6% improvement for prompt processing and a 2.3% improvement for token generation in the LLaMA 7B case.
References
#16854 - Parallelize Transpose
Author
yihonglyu
Parents
3c10f027