[HLSL][Matrix] Make matrix truncation respect default matrix memory layout (#184280)
Fixes #183127 and #184371
This PR makes the matrix truncation cast implementation use the new
matrix flattened index helper functions introduced by #182904 so that it
reads elements from the source matrix using the default matrix memory
layout instead of always assuming column-major order.
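
As a rough illustration of the idea (not the code in this PR), the sketch below truncates a matrix to its upper-left submatrix while computing every index through a layout-aware helper. The names `MatrixLayout`, `flattenedIndex`, and `truncateMatrix` are hypothetical and are not the helpers introduced by #182904. The sketch also assumes the source and destination share the same layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; not the helpers from #182904.
enum class MatrixLayout { ColumnMajor, RowMajor };

// Flattened index of element (row, col) in a rows x cols matrix.
inline std::size_t flattenedIndex(std::size_t row, std::size_t col,
                                  std::size_t rows, std::size_t cols,
                                  MatrixLayout layout) {
  assert(row < rows && col < cols);
  return layout == MatrixLayout::RowMajor ? row * cols + col
                                          : col * rows + row;
}

// Keep the upper-left dstRows x dstCols submatrix of a srcRows x srcCols
// matrix. Source indices use the source dimensions and destination indices
// use the destination dimensions, both under the same layout.
inline std::vector<float> truncateMatrix(const std::vector<float> &src,
                                         std::size_t srcRows,
                                         std::size_t srcCols,
                                         std::size_t dstRows,
                                         std::size_t dstCols,
                                         MatrixLayout layout) {
  assert(src.size() == srcRows * srcCols);
  assert(dstRows <= srcRows && dstCols <= srcCols);
  std::vector<float> dst(dstRows * dstCols);
  for (std::size_t r = 0; r < dstRows; ++r)
    for (std::size_t c = 0; c < dstCols; ++c)
      dst[flattenedIndex(r, c, dstRows, dstCols, layout)] =
          src[flattenedIndex(r, c, srcRows, srcCols, layout)];
  return dst;
}
```

With a 2x3 source stored as `{1, 2, 3, 4, 5, 6}`, truncating to 2x2 yields a different buffer depending on the layout, which is why hard-coding column-major order reads the wrong elements when the default layout is row-major.
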
This PR also fixes a bug where matrix truncation selected the wrong
elements of the source matrix.
Assisted-by: claude-opus-4.6