onnxruntime
Refactoring of attention cuda kernel: move prepare qkv and concat_past_to_present
#17559
Merged

Commits
  • move prepare qkv and present
    tianleiwu committed 2 years ago
  • update include
    tianleiwu committed 2 years ago
  • fix hipify
    tianleiwu committed 2 years ago