Initial f32, bf16 TPU SublaneShuffleOp
Add a TPU SublaneShuffleOp, which lets us swap sublanes across vregs. The contract is sublane_shuffle(lhs, rhs, pattern) -> out, where pattern gives, for every sublane of out, the position of the sublane to pull from either lhs or rhs.
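
For illustration only, the contract above can be modeled in plain Python. This is a sketch, not the op's implementation. It assumes that pattern has one entry per output sublane and that indices 0..N-1 select from lhs while N..2N-1 select from rhs; that encoding is an assumption, not something stated in this change. The function name and the list-of-rows representation of a vreg are likewise hypothetical.

```python
def sublane_shuffle(lhs, rhs, pattern):
    """Reference model of sublane_shuffle(lhs, rhs, pattern) -> out.

    lhs and rhs are lists of sublanes (each sublane is a list of lane values).
    ASSUMPTION: pattern[i] in [0, N) picks lhs[pattern[i]], and pattern[i]
    in [N, 2N) picks rhs[pattern[i] - N], where N is the sublane count.
    """
    n = len(lhs)
    if len(rhs) != n:
        raise ValueError("lhs and rhs must have the same number of sublanes")
    if len(pattern) != n:
        raise ValueError("pattern must have one entry per output sublane")

    combined = lhs + rhs  # sublanes 0..N-1 from lhs, N..2N-1 from rhs
    out = []
    for idx in pattern:
        if not 0 <= idx < 2 * n:
            raise ValueError(f"pattern index {idx} out of range")
        out.append(list(combined[idx]))  # copy so out does not alias inputs
    return out


# Toy 4-sublane, 2-lane "vregs".
lhs = [[0, 0], [1, 1], [2, 2], [3, 3]]
rhs = [[10, 10], [11, 11], [12, 12], [13, 13]]

# Keep sublanes 0 and 2 from lhs, take sublanes 1 and 3 from rhs.
out = sublane_shuffle(lhs, rhs, [0, 5, 2, 7])
print(out)  # [[0, 0], [11, 11], [2, 2], [13, 13]]
```

With this encoding, a single pattern covers both interleaving two vregs and permuting sublanes within one of them.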
PiperOrigin-RevId: 754972962