[AArch64] Implement FP8 SVE Intrinsics for narrowing conversions (#118124)
This patch adds the following intrinsics:
* Half-precision and BFloat16 convert, narrow, and interleave to 8-bit
floating-point.
// Variant is also available for: _bf16_x2
svmfloat8_t svcvtn_mf8[_f16_x2]_fpm(svfloat16x2_t zn, fpm_t fpm);
* Single-precision convert, narrow, and interleave to 8-bit
floating-point (top and bottom).
svmfloat8_t svcvtnt_mf8[_f32_x2]_fpm(svmfloat8_t zd, svfloat32x2_t zn,
                                     fpm_t fpm);
svmfloat8_t svcvtnb_mf8[_f32_x2]_fpm(svfloat32x2_t zn, fpm_t fpm);
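
For illustration only, here is a minimal sketch of how the intrinsics listed above might be called, using the fully spelled-out names from the prototypes. The helper names (`narrow_f16_to_mf8`, `narrow_f32_to_mf8`), the `FP8_NARROW_SKETCH_AVAILABLE` macro, and the feature-macro guard are assumptions made for this example and are not part of the patch. The guard assumes the toolchain defines ACLE-style `__ARM_FEATURE_SVE2` and `__ARM_FEATURE_FP8` macros when the matching target features are enabled. The caller is expected to supply a suitable `fpm` value. Chaining the bottom and top forms assumes, as the `zd` operand suggests, that the top form merges into an existing vector.

``` c
/* Assumption: ACLE-style feature macros signal SVE2 and FP8 support. */
#if defined(__ARM_FEATURE_SVE2) && defined(__ARM_FEATURE_FP8)
#include <arm_sve.h>
#define FP8_NARROW_SKETCH_AVAILABLE 1

/* Hypothetical helper: narrow one pair of f16 vectors to an FP8 vector. */
svmfloat8_t narrow_f16_to_mf8(svfloat16x2_t zn, fpm_t fpm) {
  return svcvtn_mf8_f16_x2_fpm(zn, fpm);
}

/* Hypothetical helper: narrow two pairs of f32 vectors to one FP8 vector. */
svmfloat8_t narrow_f32_to_mf8(svfloat32x2_t bottom_src, svfloat32x2_t top_src,
                              fpm_t fpm) {
  /* The bottom form has no destination operand. */
  svmfloat8_t zd = svcvtnb_mf8_f32_x2_fpm(bottom_src, fpm);
  /* The top form receives the bottom result as its zd operand. */
  return svcvtnt_mf8_f32_x2_fpm(zd, top_src, fpm);
}
#else
/* Without the assumed features, the helpers above are compiled out. */
#define FP8_NARROW_SKETCH_AVAILABLE 0
#endif
```

Building the guarded branch is expected to need a compiler that includes this patch and target features along the lines of `+sve2,+fp8`; the exact flags depend on the toolchain.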