[flang] Enable loop-versioning for slices. (#120344)
Loops resulting from array expressions like array(:,i)
may be versioned for the unit stride of the innermost dimension,
when the initial array is an assumed-shape array (which are contiguous
in many Fortran programs).
This speeds up facerec for about 12% due to further vectorization
of the innermost loop produced for the total SUM reduction.