[windows] Always pass fp128 arguments indirectly (#128848)
LLVM currently expects `__float128` to be both passed and returned in
xmm registers on Windows. However, this disagrees with the Windows
x86-64 calling convention [1], which indicates values larger than 64
bits should be passed indirectly.
Update LLVM's default Windows calling convention to pass `fp128`
directly. Returning in xmm0 is unchanged since this seems like a
reasonable extrapolation of the ABI. With this patch, the calling
convention for `i128` and `f128` is the same.
GCC passes `__float128` indirectly, which this also matches. However, it
also returns indirectly, which is not done here. I intend to attempt a
GCC change to also return in `xmm0` rather than making that change here,
given the consistency with `i128`.
This corresponds to the frontend change in [2], see more details there.
[1]:
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
[2]: https://github.com/llvm/llvm-project/pull/115052