onnxruntime
f19bb3c7 - Fix SigLIP causal mask bug (#25360)

Commit
254 days ago
### Description

The SigLIP architecture inside the vision encoder should not apply a causal mask to its attention. This change fixes the Phi 4 MM accuracy issues we have seen.

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
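To illustrate why a causal mask hurts a vision encoder, here is a minimal pure-Python sketch. It is not the ONNX Runtime implementation; the helper names `build_mask` and `attention_weights` are hypothetical and exist only for this example. It compares the attention weights of four image patches with uniform scores under a causal mask and under no mask.

```python
import math


def build_mask(seq_len, causal):
    """Return a seq_len x seq_len boolean matrix; True means query i may attend to key j."""
    return [[(j <= i) if causal else True for j in range(seq_len)]
            for i in range(seq_len)]


def attention_weights(scores, mask):
    """Row-wise softmax over the scores; masked-out positions get zero weight."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        # Subtract the largest allowed score for numerical stability.
        peak = max(s for s, m in zip(row_scores, row_mask) if m)
        exps = [math.exp(s - peak) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


num_patches = 4
# Uniform scores isolate the effect of the mask itself.
scores = [[0.0] * num_patches for _ in range(num_patches)]

causal = attention_weights(scores, build_mask(num_patches, causal=True))
bidirectional = attention_weights(scores, build_mask(num_patches, causal=False))

print(causal[0])         # [1.0, 0.0, 0.0, 0.0]
print(bidirectional[0])  # [0.25, 0.25, 0.25, 0.25]
```

Under the causal mask the first patch can only attend to itself, so earlier patches never see the rest of the image. Image patches have no left-to-right generation order, which is why a vision encoder such as SigLIP is expected to attend bidirectionally.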