Cast Nodes Fusion (#24842)

Commit

281 days ago

Cast Nodes Fusion (#24842)

### Description

A chain of Cast nodes can end up casting back to the original type. This fusion removes the extra nodes. For example,

`A ('float32') -> Cast (to='float16') -> Cast (to='int4') -> Cast (to='float32') -> Cast (to='float16') -> B`

reduces to

`A ('float32') -> Cast (to='float16') -> B`

Every Cast node along the path must have exactly one input and one output to be considered for the fusion.

### Motivation and Context

Gemma3 ONNX models used to have double casting, and many new models created by the model builder might have it as well. Extra Casts can reduce accuracy and increase inference time.
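The chain reduction from the Description can be illustrated with a small, simplified model. This is a sketch, not the onnxruntime implementation: it assumes a strictly linear chain in which every Cast has one input and one output, represents each Cast only by its `to` type, and uses a hypothetical helper named `fuse_cast_chain`. The rule it applies (drop every Cast up to and including the last one that restores the source type) is an assumption inferred from the example in the commit message.

```python
from typing import List


def fuse_cast_chain(source_type: str, cast_targets: List[str]) -> List[str]:
    """Return the Cast targets left after dropping round-trip casts.

    Hypothetical helper: models a linear Cast chain as a list of target
    types and is not part of any onnxruntime API.
    """
    # Find the last Cast in the chain that restores the source type.
    last_round_trip = -1
    for index, target in enumerate(cast_targets):
        if target == source_type:
            last_round_trip = index
    # Everything up to and including that Cast is treated as redundant.
    return cast_targets[last_round_trip + 1:]


# The chain from the commit message: A is float32, followed by four Casts.
chain = ["float16", "int4", "float32", "float16"]
print(fuse_cast_chain("float32", chain))  # ['float16']
```

In this model a chain that never returns to the source type is left untouched, and a chain that ends on the source type disappears entirely.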

References

#24842 - Cast Nodes Fusion

Author

nenad1002

Parents

340b188c

onnxruntime 24e0b07a - Cast Nodes Fusion (#24842)
