onnxruntime
24e0b07a - Cast Nodes Fusion (#24842)

Commit
207 days ago
### Description

Multiple Cast nodes in a chain may end up casting back to the original type. This fusion removes the extra nodes. For example,

`A ('float32') -> Cast (to='float16') -> Cast (to='int4') -> Cast (to='float32') -> Cast (to='float16') -> B`

reduces to

`A ('float32') -> Cast (to='float16') -> B`

Every Cast node along the path must have exactly one input and one output to be considered for the fusion.

### Motivation and Context

Gemma3 ONNX models used to have double casting, and many new models created by the model builder might have it as well. Extra Casts might reduce accuracy and increase inference time.
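The reduction in the example above can be sketched on a simplified model of a Cast chain. This is not the code added by the commit: the function name `fuse_cast_chain`, the list-of-type-names representation, and the exact collapsing rule (drop every Cast between two occurrences of the same type) are assumptions inferred from the single example in the description. The sketch also assumes the chain is strictly linear, so the one-input, one-output precondition holds for every node.

```python
from typing import List


def fuse_cast_chain(source_type: str, cast_targets: List[str]) -> List[str]:
    """Return the Cast targets that remain after collapsing a linear chain.

    Hypothetical helper for illustration only. `source_type` is the type
    produced by the node feeding the chain; `cast_targets` holds the `to`
    attribute of each Cast node, in order.
    """
    # Types seen along the chain so far, starting with the producer's type.
    seen = [source_type]
    for target in cast_targets:
        if target in seen:
            # The chain returns to a type it already had: every Cast after
            # that earlier point is treated as redundant and dropped.
            del seen[seen.index(target) + 1:]
        else:
            seen.append(target)
    # Everything after the source type is a Cast that must be kept.
    return seen[1:]


if __name__ == "__main__":
    chain = ["float16", "int4", "float32", "float16"]
    print(fuse_cast_chain("float32", chain))
```

For the chain in the description, the cast to `float32` returns to the source type, so the three Casts up to that point are dropped and only the final `Cast (to='float16')` remains. Note that dropping a narrowing Cast such as `int4` changes the numeric result; the description treats that as desirable because the extra Casts reduce accuracy.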