[Hexagon] Optimize sext + mul pattern to use vmpyh instruction (#190316)
This patch adds TableGen patterns to recognize the pattern:
(v2i32 (mul (sext v2i16), (sext v2i16)))
and to select the M2_vmpy2s_s0 instruction for it, which emits the efficient
vmpyh (vector multiply halfwords) instruction.
The transform is guarded by `nsw` because `M2_vmpy2s_s0` performs a
saturating signed multiply (`vmpyh(...):sat`), so the replacement is
only semantics-preserving when signed overflow is undefined in the IR.
Without this patch, the pattern expands to:
r3:2 = vsxthw(r0) // Sign extend
r1:0 = vsxthw(r1) // Sign extend
r1 = mpyi(r3,r1) // Scalar multiply
r0 = mpyi(r2,r0) // Scalar multiply
With this patch, it generates:
r1:0 = vmpyh(r0,r1):sat // Single vector multiply
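For reviewers who want to sanity-check the lane semantics, here is a minimal C++ sketch of both sides of the transform. It is a scalar model, not code from the patch: the names mul_sext_lane, vmpyh_sat_lane, V2I32 and vmpyh_sat_model are hypothetical, and the assumption that each vmpyh(...):sat lane computes the 16x16 product and clamps it to 32 bits with no shift should be checked against the Hexagon programmer's reference.

```cpp
#include <cstdint>
#include <limits>

// One lane of the IR pattern: sign-extend both i16 operands to i32,
// then multiply at 32 bits.
inline std::int32_t mul_sext_lane(std::int16_t a, std::int16_t b) {
    return static_cast<std::int32_t>(a) * static_cast<std::int32_t>(b);
}

// Assumed model of one vmpyh(...):sat lane (no shift): form the full
// product in 64 bits, then clamp it to the signed 32-bit range.
inline std::int32_t vmpyh_sat_lane(std::int16_t a, std::int16_t b) {
    const std::int64_t p = static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
    if (p > std::numeric_limits<std::int32_t>::max())
        return std::numeric_limits<std::int32_t>::max();
    if (p < std::numeric_limits<std::int32_t>::min())
        return std::numeric_limits<std::int32_t>::min();
    return static_cast<std::int32_t>(p);
}

// Hypothetical stand-in for the r1:0 register pair: lo is lane 0, hi is lane 1.
struct V2I32 {
    std::int32_t lo;
    std::int32_t hi;
};

// Model of r1:0 = vmpyh(rs, rt):sat, where each 32-bit source register
// packs two signed halfwords (lane 0 in the low half).
inline V2I32 vmpyh_sat_model(std::uint32_t rs, std::uint32_t rt) {
    const auto lane = [](std::uint32_t r, int i) {
        return static_cast<std::int16_t>((r >> (16 * i)) & 0xFFFFu);
    };
    return V2I32{vmpyh_sat_lane(lane(rs, 0), lane(rt, 0)),
                 vmpyh_sat_lane(lane(rs, 1), lane(rt, 1))};
}
```

Comparing mul_sext_lane against vmpyh_sat_lane over the inputs of interest is a quick way to check the lane-level equivalence the patterns rely on.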
Co-authored-by: Santanu Das <quic_santdas@quicinc.com>