[X86] combineVTRUNCSAT - attempt to recognise VTRUNCS/US(CONCAT(X,Y)) -> PACKSS/US(X,Y) folds. (#178707)
If we're just concatenating subvectors together to perform a saturated
truncate, see if we can perform the PACK on the subvectors directly
instead. A 256-bit PACK will require a post-shuffle, since it packs
within each 128-bit lane, but this will typically fold away in later
shuffle combining and it's probably better than changing vector widths
with concats.
Reference patch based on the poor codegen identified in #169995.
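
For illustration only, the equivalence behind the fold can be sketched with a
scalar model. This is not the code in the patch: every helper name below
(satS16, truncS, concat, packSS, fixLanes, checkFolds) is hypothetical, and
only the signed i32 -> i16 case (PACKSSDW) is modelled. It assumes PACK
operates independently on each 128-bit lane, which is why the 256-bit case
needs the {0,2,1,3} permute of 64-bit blocks afterwards.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Signed saturation of one i32 element to i16.
static int16_t satS16(int32_t V) {
  return static_cast<int16_t>(std::clamp<int32_t>(V, INT16_MIN, INT16_MAX));
}

// Scalar model of a signed saturating truncate, N x i32 -> N x i16.
template <std::size_t N>
std::array<int16_t, N> truncS(const std::array<int32_t, N> &Src) {
  std::array<int16_t, N> R{};
  for (std::size_t I = 0; I != N; ++I)
    R[I] = satS16(Src[I]);
  return R;
}

// CONCAT(X, Y): X occupies the low half, Y the high half.
template <std::size_t N>
std::array<int32_t, 2 * N> concat(const std::array<int32_t, N> &X,
                                  const std::array<int32_t, N> &Y) {
  std::array<int32_t, 2 * N> R{};
  std::copy(X.begin(), X.end(), R.begin());
  std::copy(Y.begin(), Y.end(), R.begin() + N);
  return R;
}

// Scalar model of PACKSSDW: within each 128-bit lane, the low 4 results come
// from X's lane and the high 4 results come from Y's lane.
template <std::size_t N>
std::array<int16_t, 2 * N> packSS(const std::array<int32_t, N> &X,
                                  const std::array<int32_t, N> &Y) {
  static_assert(N % 4 == 0, "operands must be whole 128-bit lanes of i32");
  std::array<int16_t, 2 * N> R{};
  for (std::size_t Lane = 0; Lane != N / 4; ++Lane)
    for (std::size_t I = 0; I != 4; ++I) {
      R[Lane * 8 + I] = satS16(X[Lane * 4 + I]);
      R[Lane * 8 + 4 + I] = satS16(Y[Lane * 4 + I]);
    }
  return R;
}

// Post-shuffle for the 256-bit case: permute the four 64-bit blocks
// (4 x i16 each) with mask {0,2,1,3} to undo the per-lane interleaving.
static std::array<int16_t, 16> fixLanes(const std::array<int16_t, 16> &P) {
  static const std::size_t Mask[4] = {0, 2, 1, 3};
  std::array<int16_t, 16> R{};
  for (std::size_t B = 0; B != 4; ++B)
    for (std::size_t I = 0; I != 4; ++I)
      R[B * 4 + I] = P[Mask[B] * 4 + I];
  return R;
}

// Checks both widths on a few values, including ones that saturate.
static void checkFolds() {
  // 128-bit operands: PACK(X, Y) already matches TRUNCS(CONCAT(X, Y)).
  std::array<int32_t, 4> X4{70000, -70000, 1, -1};
  std::array<int32_t, 4> Y4{32767, -32768, 32768, -32769};
  assert(packSS(X4, Y4) == truncS(concat(X4, Y4)));

  // 256-bit operands: the raw PACK result is lane-interleaved, so it only
  // matches after the post-shuffle.
  std::array<int32_t, 8> X8{100000, 1, 2, 3, 4, 5, 6, -100000};
  std::array<int32_t, 8> Y8{10, 11, 12, 13, 14, 15, 16, 17};
  assert(packSS(X8, Y8) != truncS(concat(X8, Y8)));
  assert(fixLanes(packSS(X8, Y8)) == truncS(concat(X8, Y8)));
}
```

The unsigned VTRUNCUS/PACKUS case would follow the same shape with an
unsigned clamp in place of satS16.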