onnxruntime
94c69f55
- GQA 4 CPU (#20299)
Commit
1 year ago
GQA 4 CPU (#20299)

### Description
Support GQA operator on CPU with FP32.

### Motivation and Context
Right now, models generated for CPU and GPU must be different. GQA CPU allows these models to be the same.
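
For readers unfamiliar with the operator, the following is a minimal pure-Python sketch of what grouped-query attention computes: several query heads share one key/value head. It is not the kernel added by this commit. The function name `grouped_query_attention`, the nested-list layout, the causal masking, and the absence of a past key/value cache are all assumptions made for illustration, and Python floats are double precision rather than FP32.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_query_attention(q, k, v):
    """Illustrative grouped-query attention (hypothetical helper).

    q:    [num_heads][seq_len][head_dim]
    k, v: [kv_num_heads][seq_len][head_dim]
    Assumes num_heads is a multiple of kv_num_heads, causal masking,
    and no past key/value cache.
    """
    num_heads, kv_num_heads = len(q), len(k)
    assert num_heads % kv_num_heads == 0
    group_size = num_heads // kv_num_heads
    head_dim = len(q[0][0])
    scale = 1.0 / math.sqrt(head_dim)

    out = []
    for h in range(num_heads):
        # Each block of `group_size` query heads shares one key/value head.
        kh, vh = k[h // group_size], v[h // group_size]
        head_out = []
        for i, qi in enumerate(q[h]):
            # Causal: position i attends only to positions 0..i.
            scores = [
                scale * sum(a * b for a, b in zip(qi, kh[j]))
                for j in range(i + 1)
            ]
            weights = softmax(scores)
            head_out.append([
                sum(weights[j] * vh[j][d] for j in range(i + 1))
                for d in range(head_dim)
            ])
        out.append(head_out)
    return out
```

With 4 query heads and 2 key/value heads, query heads 0 and 1 read from key/value head 0, and query heads 2 and 3 read from key/value head 1.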
References
#20299 - GQA 4 CPU
Author
aciddelgado
Parents
c47a6ce7