onnxruntime c8122bf1 - Update GQA type inference
Commit: 111 days ago
References: derdeljan/hybrid_fp16_gqa
Author: derdeljan-msft
Parents: 3e4d54aa