[X86] fuse constant addition after sbb (#184541)
Resolves: https://github.com/llvm/llvm-project/issues/171676
Related: https://github.com/llvm/llvm-project/pull/185117 (AArch64 side)
The issue points out that the fold `ADD(ADC(Y,0,W),X) -> ADC(X,Y,W)` is
already performed and that SBB can be optimized analogously:
`ADD(SBB(Y,0,W),C) -> SBB(Y,-C,W)`.
With this change, clang compiles the following example code:
```c
#include <stdint.h>
uint64_t f(uint64_t a, uint64_t b) {
uint64_t x;
x += __builtin_add_overflow(a, b, &x);
return x + 10;
}
uint64_t g(uint64_t a, uint64_t b) {
uint64_t x;
x -= __builtin_sub_overflow(a, b, &x);
return x + 10;
}
```
The sub case is now optimized as well: instead of emitting a separate `leaq`
for the `+ 10` on x86, the constant is folded into the `sbbq`:
```asm
f:
movq %rdi, %rax
addq %rsi, %rax
adcq $10, %rax
retq
g:
movq %rdi, %rax
subq %rsi, %rax
sbbq $-10, %rax
retq
```