onnxruntime
6e430c05 - A few performance improvements coming out of ssd_mobilenet and ssd_resnet34 analysis (#1578)

Commit

6 years ago

A few performance improvements coming out of ssd_mobilenet and ssd_resnet34 analysis (#1578)

* A few performance improvements:
  - Make the iteration in NonZero more efficient by using a raw pointer and simplifying the increment logic
    - add another unit test to check the new logic works with 3 dimensional tensor
    - gains about 2% for ssd_mobilenet
  - Avoid floating point operations on each iteration on Concat
    - about 0.5% for ssd_mobilenet and ssd_resnet34
  - Put common case first in ExecutionFrame::AllocateAsPerAllocationPlan to avoid unnecessary call to IsSparseTensor
    - about 0.05% for ssd_mobilenet
  - Minor tweak to put some ctors in the TensorShape header so they can be inlined more easily
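The NonZero item in the message can be illustrated with a small standalone sketch. This is not the code from the commit: `NonZeroCoords` is a hypothetical helper, and the float-only input and vector-of-vectors output are assumptions made to keep the example short. It shows the general idea of walking the buffer with a raw pointer and advancing a per-axis coordinate like an odometer, so no division or modulo is needed per element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only; not the onnxruntime kernel.
// Returns the coordinates of all non-zero elements of a row-major tensor,
// one output row per input dimension.
std::vector<std::vector<int64_t>> NonZeroCoords(const float* data,
                                                const std::vector<int64_t>& dims) {
  const std::size_t rank = dims.size();
  std::vector<std::vector<int64_t>> result(rank);

  int64_t total = 1;
  for (int64_t d : dims) total *= d;
  assert(data != nullptr || total == 0);
  if (rank == 0 || total == 0) return result;

  std::vector<int64_t> coord(rank, 0);  // coordinate of the current element
  const float* const end = data + total;

  for (const float* p = data; p != end; ++p) {
    if (*p != 0.0f) {
      for (std::size_t axis = 0; axis < rank; ++axis) {
        result[axis].push_back(coord[axis]);
      }
    }
    // Odometer-style increment: bump the innermost axis and carry
    // into outer axes only when an axis wraps around to zero.
    for (std::size_t axis = rank; axis-- > 0;) {
      if (++coord[axis] < dims[axis]) break;
      coord[axis] = 0;
    }
  }
  return result;
}
```

For a 2x2x2 input this visits each element once and, in the common case, touches only the innermost coordinate per step. The 3 dimensional unit test mentioned in the message presumably exercises the carry across more than one axis.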

References

#1578 - A few performance improvements coming out of ssd_mobilenet and ssd_resnet34 analysis

Author

skottmckay

Parents

a443b013
