[mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#166618)
This commit adds a new pass that lowers floating-point `arith`
operations to calls into the execution engine runtime library. Currently
supported operations: `addf`, `subf`, `mulf`, `divf`, `remf`.
All floating-point types that have APFloat semantics are supported.
This includes low-precision floating-point types such as `f4E2M1FN` that
cannot execute natively on CPUs.
This commit also improves the `vector.print` lowering pattern to call
into the runtime library for floating-point types that are not supported
by LLVM. This is necessary to write a meaningful integration test.
For example, the following IR
```mlir
func.func @full_example() {
%a = arith.constant 1.4 : f8E4M3FN
%b = func.call @foo() : () -> (f8E4M3FN)
%c = arith.addf %a, %b : f8E4M3FN
vector.print %c : f8E4M3FN
return
}
```
gets transformed to
```mlir
func.func private @__mlir_apfloat_add(i32, i64, i64) -> i64
func.func @full_example() {
%cst = arith.constant 1.375000e+00 : f8E4M3FN
%0 = call @foo() : () -> f8E4M3FN
// bitcast operand A to integer of equal width
%1 = arith.bitcast %cst : f8E4M3FN to i8
// zext A to i64
%2 = arith.extui %1 : i8 to i64
// same for operand B
%3 = arith.bitcast %0 : f8E4M3FN to i8
%4 = arith.extui %3 : i8 to i64
// get the llvm::fltSemantics(f8E4M3FN) as an enum
%c10_i32 = arith.constant 10 : i32
// call the impl against APFloat in mlir_apfloat_wrappers
%5 = call @__mlir_apfloat_add(%c10_i32, %2, %4) : (i32, i64, i64) -> i64
// "cast" back to the original fp type
%6 = arith.trunci %5 : i64 to i8
%7 = arith.bitcast %6 : i8 to f8E4M3FN
vector.print %7 : f8E4M3FN
return
}
```
Note that the pattern emits the `llvm::fltSemantics` enum value (here,
for `f8E4M3FN`) as the first argument each time an `arith` op is
transformed. A single `__mlir_apfloat_add` entry point therefore serves
all floating-point types, and no name mangling on the type is necessary.
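To illustrate the calling convention used in the transformed IR above
(semantics enum as `i32`, operand bit patterns zero-extended to `i64`,
result truncated back to the original width), here is a minimal,
self-contained C++ sketch. It is not the runtime library code: the real
wrappers are implemented against `llvm::APFloat`, whereas this sketch
hand-decodes `f8E4M3FN` only. The names `sketchApfloatAdd`,
`decodeF8E4M3FN`, and `encodeF8E4M3FN` are hypothetical, and overflow and
signed-zero handling are simplified and may differ from APFloat.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Enum value passed for f8E4M3FN in the example above (%c10_i32).
constexpr int32_t kF8E4M3FN = 10;

// Decode an f8E4M3FN bit pattern: 1 sign bit, 4 exponent bits (bias 7),
// 3 mantissa bits, no infinities, NaN only at S.1111.111.
float decodeF8E4M3FN(uint8_t bits) {
  int sign = bits >> 7;
  int exp = (bits >> 3) & 0xF;
  int man = bits & 0x7;
  if (exp == 0xF && man == 0x7)
    return std::nanf("");
  float mag = exp == 0 ? std::ldexp(man / 8.0f, -6)          // subnormal
                       : std::ldexp(1.0f + man / 8.0f, exp - 7);
  return sign ? -mag : mag;
}

// Encode by brute force: pick the nearest non-NaN encoding, breaking ties
// toward an even bit pattern. Simplification: out-of-range values clamp to
// the largest finite value and the sign of zero is not preserved.
uint8_t encodeF8E4M3FN(float value) {
  if (std::isnan(value))
    return 0x7F;
  uint8_t best = 0;
  float bestErr = INFINITY;
  for (int i = 0; i < 256; ++i) {
    uint8_t cand = static_cast<uint8_t>(i);
    float decoded = decodeF8E4M3FN(cand);
    if (std::isnan(decoded))
      continue;
    float err = std::fabs(decoded - value);
    bool tieToEven = err == bestErr && (cand & 1) == 0 && (best & 1) != 0;
    if (err < bestErr || tieToEven) {
      best = cand;
      bestErr = err;
    }
  }
  return best;
}

// Hypothetical stand-in for a wrapper with the (i32, i64, i64) -> i64
// signature. The semantics argument selects the type at run time, so the
// function name does not need to encode the type.
uint64_t sketchApfloatAdd(int32_t semantics, uint64_t a, uint64_t b) {
  assert(semantics == kF8E4M3FN && "this sketch handles one type only");
  // The caller zero-extended 8-bit patterns; only the low byte is used.
  float lhs = decodeF8E4M3FN(static_cast<uint8_t>(a));
  float rhs = decodeF8E4M3FN(static_cast<uint8_t>(b));
  // The sum of two f8E4M3FN values is exact in float, so the result is
  // rounded only once, in the encode step.
  return encodeF8E4M3FN(lhs + rhs);
}
```

The sketch also shows why the constant `1.4` appears as `1.375000e+00`
after the transformation: `1.4` is not representable in `f8E4M3FN`, and
`encodeF8E4M3FN(1.4f)` yields the bit pattern of the nearest value,
`1.375`.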
RFC:
https://discourse.llvm.org/t/rfc-software-implementation-for-unsupported-fp-types-in-convert-arith-to-llvm/88785
---------
Co-authored-by: Matthias Springer <me@m-sp.org>