onnxruntime
b478086b - Fuse attention node even in case of different Q,K hidden dimensions (#8106)

Commit

4 years ago

Fuse attention node even in case of different Q,K hidden dimensions (#8106)

* changes to fuse attention node and create varied dimensions
* added an option to optimizer to only do offline fusion
* fixing a typo
* merge with master
* removing extra changes
* added new unit test - test_attention_fusion_for_varied_qkv_dimensions()
* Unit test successful for q,k,v paths with varied dimensions
* adding test model for unit test case
* optimizing attention tests
* removing debugs
* minor change
* addressing comments
* addressing comments
* changed the new option to disable_onnxruntime
* replacing asserts with debugs
* make attn fusion backward compatible for head_size, hidden_size
* preserving behavior for shape_modified_tensor
* adding new option as the last parameter
* cleaning up
* line breaks and spaces
* formatting according to python
* making the changes to fuse attention node without user input
* changes to fusion_attention.py updated
* bringing the code up to python standard
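
For context, here is a minimal sketch of why attention stays well defined when the Q/K width differs from the V width. It assumes that "varied q,k,v dimensions" means the Q and K projections share one per-head size while V uses another. It is not the code in fusion_attention.py, and the helper names (matmul, softmax, single_head_attention) are hypothetical and exist only for illustration.

```python
import math


def matmul(a, b):
    # Plain-Python matrix product: a is [n, m], b is [m, p].
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    # Numerically stable softmax over one row of scores.
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def single_head_attention(q, k, v):
    # q: [seq, d_qk], k: [seq, d_qk], v: [seq, d_v]; d_v may differ from d_qk.
    d_qk = len(q[0])
    if len(k[0]) != d_qk:
        raise ValueError("Q and K must share the same per-head size")
    k_t = [list(col) for col in zip(*k)]           # [d_qk, seq]
    scores = matmul(q, k_t)                        # [seq, seq]
    scaled = [[s / math.sqrt(d_qk) for s in row] for row in scores]
    probs = [softmax(row) for row in scaled]       # each row sums to 1
    return matmul(probs, v)                        # [seq, d_v]


# Hypothetical toy inputs: Q/K width 2, V width 3.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
out = single_head_attention(q, k, v)
```

Only the Q·Kᵀ product constrains Q and K to the same width. The output width follows V, so a fused node needs to carry the Q/K and V sizes separately rather than assume a single hidden size.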

References

#8106 - Fuse attention node even in case of different Q,K hidden dimensions

Author

viboga

Parents

4fd7efcf
