onnxruntime
Change head_size parameter dependent on qkv_hidden_size
#12933
Merged

Commits
  • Change head_size parameter dependent on qkv_hidden_size
    Peter Mcaughan committed 3 years ago
  • Remove bugged code, add CUDA attention UT
    Peter Mcaughan committed 3 years ago
  • Introduce head_sizes and hidden_sizes for qkv, insert where relevant
    Peter Mcaughan committed 3 years ago
  • Merge with main
    Peter Mcaughan committed 3 years ago
  • Fix linter warnings
    Peter Mcaughan committed 3 years ago
  • UT passing, saving work. Needs clean
    Peter Mcaughan committed 3 years ago
  • Passing unit tests!
    Peter Mcaughan committed 3 years ago
  • Remove print statements
    Peter Mcaughan committed 3 years ago
  • Simplify UT
    Peter Mcaughan committed 3 years ago
  • Remove dump objects
    Peter Mcaughan committed 3 years ago
  • Resolve merge conflict
    Peter Mcaughan committed 3 years ago
  • Undo undesired change
    Peter Mcaughan committed 3 years ago
  • Undo undesired change
    Peter Mcaughan committed 3 years ago
  • Address comments
    Peter Mcaughan committed 3 years ago
  • Fix linter warnings & comments
    Peter Mcaughan committed 3 years ago
  • Fix variable names
    Peter Mcaughan committed 3 years ago
  • Add fp16 checks for qkv_head_size and remove unused variables
    Peter Mcaughan committed 3 years ago
  • Fix longformer test failures
    Peter Mcaughan committed 3 years ago
  • Avoid future silent errors
    Peter Mcaughan committed 3 years ago
  • Avoid ROCM execution
    Peter Mcaughan committed 3 years ago
  • Add support for disabling ROCM in AttentionTest
    Peter Mcaughan committed 3 years ago
  • Add comments to clarify feature enablement in nonquantized CUDA only
    Peter Mcaughan committed 3 years ago