[X86] remove unnecessary movs when %rdx is an input to mulx (#184462)
Closes: https://github.com/llvm/llvm-project/issues/174912
When generating a `mulx` instruction for a widening multiplication, LLVM
does not use an input that is already in %rdx as the implicit source
operand. Instead, it emits two movs before the `mulx` to swap the
registers, and both are unnecessary. GCC already has this optimization
(as shown in the issue), so this change brings the two compilers closer
to each other on that front.
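
For illustration, here is a minimal sketch of the kind of source that can hit this path. It is not the reproducer from the issue. The function name, the `u128_parts` type, and the two unused padding parameters are hypothetical. They are there only so that, under the System V x86-64 calling convention, `a` arrives in %rdx as the third integer argument. It also assumes a 64-bit GCC or Clang target, since `unsigned __int128` is a compiler extension, and BMI2 enabled (e.g. `-O2 -mbmi2`) so the compiler may select `mulx`.

```c
#include <stdint.h>

/* Hypothetical result type: the low and high 64-bit halves of the product. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_parts;

/* 64x64 -> 128-bit widening multiply. The first two parameters are unused
 * padding so that `a` is the third integer argument, which the System V
 * x86-64 ABI passes in %rdx, the implicit source operand of mulx. */
u128_parts widening_mul_rdx(uint64_t unused0, uint64_t unused1,
                            uint64_t a, uint64_t b) {
    (void)unused0;
    (void)unused1;
    unsigned __int128 product = (unsigned __int128)a * b;
    u128_parts r = { (uint64_t)product, (uint64_t)(product >> 64) };
    return r;
}
```

Whether a given compiler version emits extra movs for this exact function depends on register allocation, so comparing the generated assembly before and after the change is the reliable check.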
Co-authored-by: Aiden Grossman <aidengrossman@google.com>