onnxruntime
156368b6 - Quantize attention with Cuda (#3693)

Commit
6 years ago
Quantize attention with Cuda (#3693)

* Add definition of QAttention
* implementation of QAttention on GPU