pytorch
2240018c - Construct `c10::Half` from `float16_t` on ARMv8 (#120425)

Commit

212 days ago

Construct `c10::Half` from `float16_t` on ARMv8 (#120425)

By hiding float32 constructors and exposing float16 ones. This allows the compiler to do implicit conversions as needed and, in safe cases, to optimize out unneeded upcasts to fp32; see the example [below](https://godbolt.org/z/5TKnY4cos):

```cpp
#include <arm_neon.h>

#ifndef __ARM_FEATURE_FP16_SCALAR_ARITHMETIC
#error Ieeee
#endif

float16_t sum1(float16_t x, float16_t y) {
  return x + y;
}

float16_t sum2(float16_t x, float16_t y) {
  return static_cast<float>(x) + static_cast<float>(y);
}
```

Both sum variants are compiled to a scalar fp16 add when built for a platform that supports fp16 arithmetic:

```
sum1(half, half):                       // @sum1(half, half)
        fadd    h0, h0, h1
        ret
sum2(half, half):                       // @sum2(half, half)
        fadd    h0, h0, h1
        ret
```

Fixes a build error after #119483 in some aarch64 configurations that are defined as supporting FP16 but don't define `_Float16`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/120425
Approved by: https://github.com/mikekgfb, https://github.com/atalman, https://github.com/snadampal
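The constructor swap described in the commit message can be illustrated with a small, portable sketch. `HalfSketch`, `SKETCH_NATIVE_FP16`, and `add_as_float` are hypothetical names introduced here for illustration; this is not the actual `c10::Half` definition, and the fallback branch simply stores a `float` so the snippet also compiles off ARM.

```cpp
// Assumption: native fp16 is detected with the same ACLE macro used in the
// commit's godbolt example; the real PyTorch guard may differ.
#if defined(__aarch64__) && defined(__ARM_FEATURE_FP16_SCALAR_ARITHMETIC)
#include <arm_neon.h>
#define SKETCH_NATIVE_FP16 1
#else
#define SKETCH_NATIVE_FP16 0
#endif

// Hypothetical stand-in for a half-precision wrapper type.
struct HalfSketch {
#if SKETCH_NATIVE_FP16
  using native_t = float16_t;  // only the fp16 constructor is exposed
#else
  using native_t = float;      // fallback so the sketch builds elsewhere
#endif

  native_t value{};

  HalfSketch() = default;

  // The single converting constructor. With fp16 as native_t, a float
  // argument is first narrowed by a standard conversion, then wrapped.
  HalfSketch(native_t v) : value(v) {}

  // The single conversion operator. A cast to float goes through native_t,
  // which lets the compiler decide whether an upcast is really needed.
  operator native_t() const { return value; }
};

// Mirrors sum2 from the commit message: the source spells out float casts,
// and the compiler is free to elide them where that is safe.
inline float add_as_float(HalfSketch a, HalfSketch b) {
  return static_cast<float>(a) + static_cast<float>(b);
}
```

Because only one constructor and one conversion operator exist in each configuration, calls such as `HalfSketch(1.5f)` stay unambiguous: the float argument reaches the constructor through a built-in conversion rather than through a competing overload.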

Author

malfet

Committer

pytorchmergebot

Parents

3f6be769
