onnxruntime
bda012a4 - Scripts to convert model with MultiHeadAttention to packing mode (#16925)

Commit
2 years ago
### Description

Update the scripts for converting a model with MultiHeadAttention to packing mode.

- [x] Update symbolic shape inference for PackedMultiHeadAttention and GatedRelativePositionBias
- [x] Update convert_to_packing_mode to handle models with MultiHeadAttention

### Motivation and Context