onnxruntime
54fdb640 - Address performance regression with duplicate initializers across DML partitions (#16087)

Commit

2 years ago

Address performance regression with duplicate initializers across DML partitions (#16087)

This addresses a DML performance regression introduced by the constant sharing pass. The constant sharing pass identifies small initializer tensors that contain identical values and merges them. A merged tensor can end up shared across DML partitions, which could cause DML to treat it as non-constant and skip certain optimizations. To prevent this, there is now an element count threshold below which the DML EP still enables these optimizations, even though doing so results in duplicate work uploading and pre-processing the common tensor at multiple operators.
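The trade-off described above can be illustrated with a small, self-contained sketch. This is not the onnxruntime implementation (which is C++); the names `Initializer`, `share_constants`, `treat_as_constant`, and the threshold value are all hypothetical, and the rule that a tensor used by a single partition is always treated as constant is a simplifying assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical value: the commit message does not state the actual threshold.
SMALL_INITIALIZER_ELEMENT_THRESHOLD = 8


@dataclass(frozen=True)
class Initializer:
    name: str
    values: Tuple[float, ...]


def share_constants(initializers: List[Initializer]) -> Dict[str, str]:
    """Toy constant sharing: map each initializer name to the name of the
    first initializer seen with identical values."""
    canonical: Dict[Tuple[float, ...], str] = {}
    mapping: Dict[str, str] = {}
    for init in initializers:
        # setdefault keeps the first name registered for these values.
        mapping[init.name] = canonical.setdefault(init.values, init.name)
    return mapping


def treat_as_constant(
    element_count: int,
    partition_count: int,
    threshold: int = SMALL_INITIALIZER_ELEMENT_THRESHOLD,
) -> bool:
    """Decide whether a partition may treat an initializer as constant.

    Assumed model: a tensor used by one partition is always constant.
    A tensor shared by several partitions is treated as constant only when
    its element count is below the threshold, accepting that each consumer
    uploads and pre-processes its own copy.
    """
    if partition_count <= 1:
        return True
    return element_count < threshold


inits = [
    Initializer("scale_a", (1.0,)),
    Initializer("scale_b", (1.0,)),  # identical to scale_a, so it is merged
    Initializer("bias", (2.0,)),
]
mapping = share_constants(inits)
small_shared = treat_as_constant(element_count=1, partition_count=2)
large_shared = treat_as_constant(element_count=1000, partition_count=2)
```

In this sketch, `scale_b` is merged into `scale_a`; the one-element shared tensor remains eligible for constant-only optimizations, while the 1000-element shared tensor does not. The threshold keeps the duplicated upload cost bounded to small tensors.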

References

#16087 - Address performance regression with duplicate initializers across DML partitions

Author

jeffbloo

Parents

a5410515
