[mlir][Vector] Improve vector.transferx store-to-load-forwarding (#171840)
This patch changes the transfer_write -> transfer_read store-to-load
forwarding canonicalization pattern to rely on permutation maps and
less on ad hoc logic. The old logic could not canonicalize a simple
unit-dim broadcast through transfer_write/transfer_read; that case is
added as a test in this patch.
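
For illustration only, here is a minimal sketch of the kind of IR this is about. The function name, shapes, and the folded form in the trailing comment are assumptions made for this example, not copied from the test added in this patch:

```mlir
// Hypothetical input: write a vector into a tensor, then read it back
// with a permutation map that adds a leading broadcast unit dim.
func.func @forward_unit_dim_broadcast(%v: vector<4xf32>,
                                      %dest: tensor<4xf32>) -> vector<1x4xf32> {
  %c0 = arith.constant 0 : index
  %pad = arith.constant 0.0 : f32
  %w = vector.transfer_write %v, %dest[%c0] {in_bounds = [true]}
      : vector<4xf32>, tensor<4xf32>
  %r = vector.transfer_read %w[%c0], %pad
      {in_bounds = [true, true],
       permutation_map = affine_map<(d0) -> (0, d0)>}
      : tensor<4xf32>, vector<1x4xf32>
  return %r : vector<1x4xf32>
}

// Assumed shape of the result once the read is forwarded from the write:
//   %r = vector.broadcast %v : vector<4xf32> to vector<1x4xf32>
```

Here the read covers exactly the written region, so the read value can be derived from %v and the two permutation maps alone, without going through the tensor.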
This patch also documents what would be needed to support the cases
that are not yet implemented.