onnxruntime
60ce9ccd - Relax GQA seqlens_k shape validation for backward compat with older models (#28259)

Commit

64 days ago

Relax GQA seqlens_k shape validation for backward compat with older models (#28259)

### Problem

PR #28031 fixed a security OOB GEMM bug via crafted seqlens_k by changing `&&` to `||` in the shape validation in group_query_attention_helper.h. This correctly enforces the spec (1D Tensor of shape (batch_size)) but breaks models (e.g. qwen3-0.6b, qwen3-1.7b) whose builder.py emits seqlens_k with shape [1,1] instead of [1].

### Fix

Relax the shape check to accept shapes with unit dimensions around the batch axis. The validation rule is:

1. **seqlens_k must be at least 1D** (scalars are rejected)
2. **Total element count must equal batch_size**
3. **Each dimension must be 1 or batch_size** (e.g. accepts [B], [B,1], [1,B] but rejects [2,2] for B=4)

Also fixes the same latent `&&`/`||` bug in the JS/WebGPU EP (group-query-attention.ts).

**Security**: The per-element value bounds checks in Compute() are unchanged -- the OOB fix from #28031 is fully preserved.

### Changes

- group_query_attention_helper.h -- scalar rejection + element-count shape check (shared by CPU, CUDA, WebGPU EPs)
- group-query-attention.ts -- same fix for the JS WebGPU path
- group_query_attention_op_test.cc -- tests for [1,1] compat, multi-batch [2,1] compat, trailing-batch [1,2] compat, scalar rejection, wrong-count rejection, and invalid factored shape rejection

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
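The three-part validation rule described under "Fix" can be illustrated with a minimal standalone sketch. This is not the code from group_query_attention_helper.h: the function name `IsValidSeqlensKShape`, its signature, and the use of `std::vector<int64_t>` for the shape are assumptions made for illustration, and the sketch additionally assumes `batch_size >= 1`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not the actual onnxruntime function): returns true when
// `dims` is an acceptable seqlens_k shape for the given batch size under the
// relaxed rule from the commit message. Assumes batch_size >= 1.
bool IsValidSeqlensKShape(const std::vector<int64_t>& dims, int64_t batch_size) {
  // Rule 1: must be at least 1D, so a scalar (rank 0) is rejected.
  if (dims.empty() || batch_size < 1) {
    return false;
  }

  int64_t element_count = 1;
  for (int64_t d : dims) {
    // Rule 3: every dimension must be 1 or batch_size.
    if (d != 1 && d != batch_size) {
      return false;
    }
    if (d != 1) {
      // A second non-unit dimension would make the element count exceed
      // batch_size; reject here instead of multiplying, which also avoids
      // any risk of integer overflow.
      if (element_count != 1) {
        return false;
      }
      element_count = d;
    }
  }

  // Rule 2: total element count must equal batch_size.
  return element_count == batch_size;
}
```

Under these assumptions, `IsValidSeqlensKShape({1, 1}, 1)`, `IsValidSeqlensKShape({2, 1}, 2)`, and `IsValidSeqlensKShape({1, 2}, 2)` are accepted, while a scalar shape `{}`, a wrong-count shape such as `{3}` for a batch of 2, and the factored shape `{2, 2}` for a batch of 4 are rejected. These mirror the cases listed for group_query_attention_op_test.cc.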

References

#28259 - Relax GQA seqlens_k shape validation for backward compat with older models

Author

vraspar

Parents

d02a0fd5
