onnxruntime
156368b6
- Quantize attention with Cuda (#3693)
6 years ago
Quantize attention with Cuda (#3693)

* Add definition of QAttention
* Implementation of QAttention on GPU
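The commit adds a quantized attention op and a GPU implementation. As a rough illustration of what such an op computes, the sketch below dequantizes int8 activations and weights with per-tensor scales, then runs ordinary multi-head self-attention with a fused Q/K/V projection. The function name `qattention` and its signature are illustrative only, not the actual onnxruntime kernel interface.

```python
import numpy as np

def qattention(x_q, x_scale, w_q, w_scale, bias, num_heads):
    # Hypothetical sketch: dequantize int8 inputs/weights back to float,
    # then compute standard multi-head self-attention.
    x = x_q.astype(np.float32) * x_scale      # (batch, seq, hidden)
    w = w_q.astype(np.float32) * w_scale      # (hidden, 3*hidden)
    qkv = x @ w + bias                        # fused Q/K/V projection
    batch, seq, hidden = x.shape
    head_dim = hidden // num_heads
    q, k, v = np.split(qkv, 3, axis=-1)

    def to_heads(t):
        # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        return t.reshape(batch, seq, num_heads, head_dim).transpose(0, 2, 1, 3)

    q, k, v = to_heads(q), to_heads(k), to_heads(v)
    scores = q @ k.transpose(0, 1, 3, 2) / np.sqrt(head_dim)
    probs = np.exp(scores - scores.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)     # softmax over keys
    out = probs @ v
    # (batch, heads, seq, head_dim) -> (batch, seq, hidden)
    return out.transpose(0, 2, 1, 3).reshape(batch, seq, hidden)
```

In the real kernel the matrix multiply would typically stay in int8 on the GPU for speed, with dequantization folded into the GEMM epilogue; the float reference above only shows the mathematically equivalent result.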
References
#3693 - Quantize attention with Cuda
Author
yufenglee
Parents
49f06104