onnxruntime
27cbebe3 - GatherBlockQuantized shape inference test (#25769)

Commit
131 days ago
GatherBlockQuantized shape inference test (#25769)

### Description

Fix the GatherBlockQuantized shape inference test.

### Motivation and Context

In the GatherBlockQuantized op's contrib_defs, we have this shape inference test:

```
for (int i = 0; i < r; ++i) {
  if (!data_shape.dim(i).has_dim_value() ||
      !scales_shape.dim(i).has_dim_value() ||
      (i == quantize_axis && (data_shape.dim(i).dim_value() * components + block_size - 1) / block_size != scales_shape.dim(i).dim_value()) ||
      (i != quantize_axis && data_shape.dim(i).dim_value() != scales_shape.dim(i).dim_value())) {
    fail_shape_inference("data shape and scales shape do not match");
  }
}
```

This code was introduced last year. However, when I try to share weights for the phi-4-mini-instruct model

<img width="233" height="494" alt="image" src="https://github.com/user-attachments/assets/9c220543-0b81-4867-bcd1-1b7aa49e20cd" />

I need to feed a Reshape operator into GatherBlockQuantized. The output shape of that Reshape does not come from the initializer directly, but from a Concat that needs some constant folding first. Therefore, on the first sweep of shape inference, `data_shape.dim(i).has_dim_value()` is `False`, which fails shape inference, and the model cannot work.

Therefore, the shape check should only run for a dimension when `data_shape.dim(i).has_dim_value()` is `True`, and likewise for `scales_shape`.
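
The relaxed check described above could look roughly like the sketch below. This is an illustration only, not the code from this commit: `Dim` and `ShapesMatch` are hypothetical stand-ins (an empty `std::optional` plays the role of `has_dim_value()` returning false), and the explicit rank comparison is an added assumption to keep the example safe on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for an ONNX dimension: empty means the value
// is not known yet (has_dim_value() would return false).
using Dim = std::optional<int64_t>;

// Returns false only on a definite mismatch. Dimensions that are still
// unknown on either side are skipped instead of being treated as errors.
bool ShapesMatch(const std::vector<Dim>& data_shape,
                 const std::vector<Dim>& scales_shape,
                 int quantize_axis, int64_t components, int64_t block_size) {
  // Assumption for this sketch: both shapes must have the same rank.
  if (data_shape.size() != scales_shape.size()) return false;

  for (std::size_t i = 0; i < data_shape.size(); ++i) {
    // Nothing to verify until both dimensions have concrete values.
    if (!data_shape[i].has_value() || !scales_shape[i].has_value()) continue;

    const int64_t data_dim = *data_shape[i];
    const int64_t scales_dim = *scales_shape[i];

    if (static_cast<int>(i) == quantize_axis) {
      // Along the quantize axis, scales holds one entry per block (ceil division).
      const int64_t blocks = (data_dim * components + block_size - 1) / block_size;
      if (blocks != scales_dim) return false;
    } else if (data_dim != scales_dim) {
      return false;
    }
  }
  return true;
}
```

In the real operator schema, a `false` result would correspond to calling `fail_shape_inference("data shape and scales shape do not match")`. Skipping unknown dimensions lets the first inference sweep pass, while a later sweep (after constant folding has resolved the Reshape output) can still catch a genuine mismatch.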