onnxruntime
197da135
- Implement quantized Attention on cpu (#4111)
Commit
5 years ago
Implement quantized Attention on cpu (#4111)

* Implement QAttention on CPU
* support QAttention in quantization tool
* refine attention code
* add more unit tests
References
#4111 - Implement quantized Attention on cpu
Author
yufenglee
Parents
62b44527