onnxruntime
54fdb640 - Address performance regression with duplicate initializers across DML partitions (#16087)

Commit
2 years ago
This addresses a DML performance regression introduced by the constant sharing pass. The constant sharing pass identifies small initializer tensors that contain identical values and merges them. This merging can cause DML to treat those tensors as non-constant and skip certain optimizations. To prevent this, the DML EP now applies an element count threshold: for tensors below it, the EP keeps the optimization enabled, even though this duplicates the work of uploading and pre-processing the shared tensor at each operator that uses it.
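The threshold decision described above can be sketched roughly as follows. This is an illustrative Python model only, not the DML EP's actual C++ code: the names `Initializer`, `treat_as_constant`, and `SMALL_INITIALIZER_MAX_ELEMENTS`, as well as the cutoff value of 64, are hypothetical, since the commit message does not state them. The sketch also assumes that an initializer used by only one partition is always treated as constant.

```python
from dataclasses import dataclass
from math import prod

# Hypothetical cutoff: the commit message does not give the real value.
SMALL_INITIALIZER_MAX_ELEMENTS = 64


@dataclass(frozen=True)
class Initializer:
    """Hypothetical stand-in for an initializer tensor in the graph."""
    name: str
    shape: tuple[int, ...]
    consumer_partitions: frozenset[int]  # IDs of partitions that read this tensor

    @property
    def element_count(self) -> int:
        # prod(()) is 1, so a scalar counts as one element.
        return prod(self.shape)


def treat_as_constant(init: Initializer,
                      threshold: int = SMALL_INITIALIZER_MAX_ELEMENTS) -> bool:
    """Decide whether each consuming partition keeps its constant-input optimization.

    Assumption: a tensor used by a single partition is always constant.
    A tensor shared across partitions stays constant only if it is small,
    accepting a duplicate upload and pre-processing per consumer.
    """
    if len(init.consumer_partitions) <= 1:
        return True
    return init.element_count < threshold


shared_small = Initializer("scale", (4,), frozenset({0, 1}))
shared_large = Initializer("weights", (256, 256), frozenset({0, 1}))
```

Under these assumptions, `shared_small` keeps the constant treatment in both partitions, while `shared_large` does not, because duplicating a large tensor would cost more than the optimization saves.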