[DAGCombiner] Fix subvector extraction index for big-endian STLF (#180795)
This PR fixes a big-endian regression in `ForwardStoreValueToDirectLoad`
where the wrong subvector was being extracted. In big-endian, memory
offset 0 corresponds to the high bits, so the extraction index needs to
be adjusted.
As suggested by @KennethHilmersson, calculate the extraction index as
the difference between the number of elements in the intermediate vector
and the load vector when in big-endian mode.
Special thanks to Kenneth Hilmersson for providing the fix logic and the
ARM regression test.
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3878065191
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3879575092