[RISCV] Update SiFive7's scheduling models with their optimizations on permutation instructions (#160763)
In newer SiFive7 cores like X390, permutation instructions like
vrgather.vv operating on an LMUL smaller than a single DLEN can complete
in a constant number of cycles. For slightly larger data that still
satisfies the constraint
`log2(SEW/8) + log2(LMUL) <= log2(DLEN / 32)`, these instructions take a
number of cycles proportional to the square of LMUL, rather than
proportional to VL.
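
As a rough illustration of the constraint only (not the TableGen code in
this patch), the inequality can be checked with exact arithmetic by
rewriting it as `(SEW/8) * LMUL <= DLEN/32`. The function name and the
DLEN values below are hypothetical and chosen just for the example; the
constant-cycle case for small LMUL is not modeled here.

```python
from fractions import Fraction


def fits_lmul_squared_constraint(sew: int, lmul: Fraction, dlen: int) -> bool:
    """Check log2(SEW/8) + log2(LMUL) <= log2(DLEN/32).

    Hypothetical helper: the log form is equivalent to
    (SEW/8) * LMUL <= DLEN/32, which is evaluated here with exact
    rationals so that fractional LMUL (1/2, 1/4, 1/8) needs no
    floating-point logarithms.
    """
    return Fraction(sew, 8) * lmul <= Fraction(dlen, 32)


if __name__ == "__main__":
    dlen = 256  # illustrative DLEN, not taken from a specific core
    for sew in (8, 16, 32, 64):
        for lmul in (Fraction(1, 2), Fraction(1), Fraction(2),
                     Fraction(4), Fraction(8)):
            ok = fits_lmul_squared_constraint(sew, lmul, dlen)
            print(f"SEW={sew:2d} LMUL={str(lmul):>3} within constraint: {ok}")
```

With an assumed DLEN of 256 the bound is `DLEN/32 = 8`, so SEW=8 with
LMUL=8 still satisfies the constraint, while SEW=16 with LMUL=8 does not
and would presumably fall back to VL-proportional cycles.
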
Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>