[ROCm] Enable MXFP4 MoE weight pre-shuffling on gfx950 and update aiter (#34192)
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Gregory Shtrasberg <156009573+gshtras@users.noreply.github.com>
Co-authored-by: tjtanaavllm <tunjian.tan@amd.com>