onnxruntime
b77dbd43 - Optimize layout for SubgroupMatrixLoad on Intel (#25384)

Commit
154 days ago
Optimize layout for SubgroupMatrixLoad on Intel (#25384)

This introduces a new LayoutProgram to pre-process the input matrix A, converting it to a layout that is more efficient for the SubgroupMatrixLoad operation on Intel GPUs.
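
As an illustration of the idea behind such a layout pass, the sketch below repacks a row-major matrix so that each tile is stored contiguously. It is not the LayoutProgram from this commit: the function name `to_tile_major`, the 8x16 tile shape, and the tile-major ordering are all assumptions made for illustration, and the real program runs on the GPU rather than in Python.

```python
# Illustrative only: repack a flat row-major matrix into a tile-major layout.
# TILE_ROWS and TILE_COLS are hypothetical values, not taken from the commit.
TILE_ROWS = 8
TILE_COLS = 16


def to_tile_major(a, rows, cols, tile_rows=TILE_ROWS, tile_cols=TILE_COLS):
    """Return a flat list in which every tile_rows x tile_cols tile is contiguous.

    `a` is a flat row-major list of length rows * cols. Both dimensions must be
    multiples of the tile shape (a simplification made for this sketch).
    """
    if len(a) != rows * cols:
        raise ValueError("a must have rows * cols elements")
    if rows % tile_rows or cols % tile_cols:
        raise ValueError("matrix dimensions must be multiples of the tile shape")

    out = []
    for tile_r in range(0, rows, tile_rows):        # first row of the tile
        for tile_c in range(0, cols, tile_cols):    # first column of the tile
            for r in range(tile_r, tile_r + tile_rows):
                start = r * cols + tile_c
                # Copy one row of the tile; rows of a tile end up back to back.
                out.extend(a[start:start + tile_cols])
    return out
```

With a tile-contiguous layout, loading one tile reads a single contiguous run of elements instead of several strided rows, which is the general motivation for this kind of pre-processing.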