Commit
1 year ago
GQA 4 CPU (#20299)

### Description
Support GQA operator on CPU with FP32.

### Motivation and Context
Right now, models generated for CPU and GPU must be different. GQA CPU allows these models to be the same.
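For readers unfamiliar with the operator, here is a rough sketch of the computation, assuming GQA means grouped-query attention, where several query heads share one key/value head. It is not the kernel added by this commit. The function name, the nested-list tensor layout, and the causal mask without a past key/value cache are all assumptions made for illustration, and plain Python floats stand in for FP32.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_query_attention(q, k, v, causal=True):
    """Hypothetical reference for grouped-query attention.

    q:    [num_q_heads][seq_len][head_dim]
    k, v: [num_kv_heads][seq_len][head_dim]
    Each block of num_q_heads // num_kv_heads query heads reads the same
    key/value head. Assumes q and k/v have the same sequence length.
    """
    num_q_heads, num_kv_heads = len(q), len(k)
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    group = num_q_heads // num_kv_heads
    head_dim = len(q[0][0])
    scale = 1.0 / math.sqrt(head_dim)

    out = []
    for h in range(num_q_heads):
        # Query head h is mapped to its shared key/value head.
        kh, vh = k[h // group], v[h // group]
        head_out = []
        for i, qi in enumerate(q[h]):
            # With causal masking, position i attends to positions 0..i.
            last = i + 1 if causal else len(kh)
            scores = [
                scale * sum(a * b for a, b in zip(qi, kh[j]))
                for j in range(last)
            ]
            weights = softmax(scores)
            head_out.append([
                sum(weights[j] * vh[j][d] for j in range(last))
                for d in range(head_dim)
            ])
        out.append(head_out)
    return out
```

With 4 query heads and 2 key/value heads, query heads 0 and 1 read key/value head 0 and query heads 2 and 3 read key/value head 1, so the key/value tensors are half the size they would be in standard multi-head attention.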