onnxruntime
Ability to fuse non-square (pruned) attention weights for BERT-like models
#6850
Merged
