onnxruntime
Add Memory Efficient Attention decode support and tests for ONNX Attention
#27851

Open

titaiwangms wants to merge 8 commits into main from feature/mea-decode-support

Add MEA+decode test cases for ONNX Attention LLM op

719964ae

Add MEA+decode support in ONNX Attention LLM op

6ab008b7

Fix MEA eligibility: skip decode when head_size != v_head_size

f5befa80

Fix review findings: use v_head_size for V ops, add safety comment

53bddd27

Fix test review findings for MEA+decode tests

7318e243

Zero present buffers before concat to prevent NaN propagation

c2da4b12

Add asymmetric head_size regression test for MEA fallback

aadf5da1

Fix FURB110 lint: use `or` instead of ternary for v_head_size

0bdde29d

titaiwangms changed the title ~~Add Memory Efficient Attention decode support and tests for ONNX~~ Add Memory Efficient Attention decode support and tests for ONNX Attention 11 days ago

