[MLIR] Add sincos fusion pass (#161413)
We see performance improvements in hot loops that compute sin() and
cos() of the same operand when a single sincos is used instead, since
the two results can share intermediate calculations. Add a pass that
finds sin() and cos() ops in the same block with the same operand and
the same fast-math flags, and fuses each such pair into one sincos op.
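
As a rough illustration of the matching rule only, here is a toy Python model of the fusion. It is not the MLIR implementation: the `Op` record, the `fuse_sincos` function, and the string-based operand and fast-math fields are all hypothetical stand-ins, and placing the fused op at the position of the earlier of the two original ops is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical stand-in for an operation inside a single block."""
    name: str                # "sin", "cos", "sincos", or any other op name
    operand: str             # name of the input value, e.g. "%x"
    results: tuple           # names of the values this op defines
    fastmath: str = "none"   # fast-math flags, modeled as a plain string


def fuse_sincos(block):
    """Fuse sin/cos pairs that share an operand and fast-math flags.

    `block` is a list of Op in program order. Each sin (or cos) is paired
    with the first later, not-yet-paired cos (or sin) that matches. The
    fused op takes the place of the earlier op (an assumption of this
    sketch) and keeps both original result names, so users need no update.
    """
    fused = []
    consumed = set()  # indices of ops already absorbed into a sincos
    for i, op in enumerate(block):
        if i in consumed:
            continue
        if op.name not in ("sin", "cos"):
            fused.append(op)
            continue
        partner_name = "cos" if op.name == "sin" else "sin"
        partner = next(
            (j for j in range(i + 1, len(block))
             if j not in consumed
             and block[j].name == partner_name
             and block[j].operand == op.operand
             and block[j].fastmath == op.fastmath),
            None,
        )
        if partner is None:
            fused.append(op)  # no match in this block: leave the op alone
            continue
        consumed.add(partner)
        other = block[partner]
        sin_op, cos_op = (op, other) if op.name == "sin" else (other, op)
        fused.append(Op("sincos", op.operand,
                        sin_op.results + cos_op.results, op.fastmath))
    return fused
```

In this model a `sin` and a `cos` of `%x` with identical flags collapse into one `sincos` that defines both results, while a pair whose flags differ is left untouched.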
Follow-up to:
* #160561
* #160772