onnxruntime
aadf5da1 - Add asymmetric head_size regression test for MEA fallback

Commit

1 day ago

Add asymmetric head_size regression test for MEA fallback

Add TestONNXAttentionMHAAsymmetricHeadSize to verify that MEA gracefully falls back to unfused attention when head_size != v_head_size with past_key present (decode phase). Without the eligibility guard in ComputeInternal, this would crash with ORT_ENFORCE in LaunchConcatNewToPastKV.

To support this test, add a v_head_size field to AttentionConfig (defaults to 0 = same as head_size) and propagate it through the ONNX graph builder and io_binding helpers for V-related shapes (V input, past_value, present_value, Y output). All existing tests are unaffected since they don't set v_head_size.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Agent-signed-off: Developer (b0ebe545) [claude-opus-4.6]
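The v_head_size default and the fallback condition described above can be illustrated with a minimal sketch. This is not the code from the commit: the fields other than head_size and v_head_size, the helper names (effective_v_head_size, v_related_shapes, mea_eligible), and the 4D (batch, heads, sequence, head) layout are all assumptions made for illustration, and the real guard in ComputeInternal may check more conditions than the single one shown here.

```python
from dataclasses import dataclass


@dataclass
class AttentionConfig:
    # Hypothetical subset of fields; only head_size and v_head_size
    # are named in the commit message.
    batch_size: int
    q_sequence_length: int
    kv_sequence_length: int
    past_kv_sequence_length: int
    q_num_heads: int
    kv_num_heads: int
    head_size: int
    v_head_size: int = 0  # 0 means "same as head_size"


def effective_v_head_size(cfg: AttentionConfig) -> int:
    """Resolve the 0 sentinel to head_size."""
    return cfg.v_head_size if cfg.v_head_size > 0 else cfg.head_size


def v_related_shapes(cfg: AttentionConfig) -> dict:
    """Shapes that depend on v_head_size (assumed 4D layout)."""
    v = effective_v_head_size(cfg)
    total = cfg.past_kv_sequence_length + cfg.kv_sequence_length
    return {
        "V": (cfg.batch_size, cfg.kv_num_heads, cfg.kv_sequence_length, v),
        "past_value": (cfg.batch_size, cfg.kv_num_heads, cfg.past_kv_sequence_length, v),
        "present_value": (cfg.batch_size, cfg.kv_num_heads, total, v),
        "Y": (cfg.batch_size, cfg.q_num_heads, cfg.q_sequence_length, v),
    }


def mea_eligible(cfg: AttentionConfig) -> bool:
    """Illustrative stand-in for the eligibility guard: with a past
    KV cache present, asymmetric head sizes rule out the MEA path."""
    has_past = cfg.past_kv_sequence_length > 0
    return not (has_past and cfg.head_size != effective_v_head_size(cfg))


# Decode step: one new token, four cached tokens, asymmetric head sizes.
decode_cfg = AttentionConfig(
    batch_size=1, q_sequence_length=1, kv_sequence_length=1,
    past_kv_sequence_length=4, q_num_heads=2, kv_num_heads=2,
    head_size=64, v_head_size=32,
)
shapes = v_related_shapes(decode_cfg)
```

In this sketch a config that leaves v_head_size unset resolves to head_size and stays eligible, which mirrors why existing tests are unaffected, while decode_cfg is reported as ineligible and would take the unfused path.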

References

#27851 - Add Memory Efficient Attention decode support and tests for ONNX Attention

Author

titaiwangms

Parents

c2da4b12
