[Clang][bytecode] Add interp__builtin_elementwise_triop_fp to handle general 3-operand floating point intrinsics (#157106)
Refactor interp__builtin_elementwise_fma into something similar to interp__builtin_elementwise_triop with a callback function argument to allow reuse with other intrinsics.
This will allow reuse with some upcoming x86 intrinsics