llvm-project
a5e55e7e - [AMDGPU] Optimize fsub and fneg when packed fp32 ops are supported (#195962)

Commit
30 days ago
Use v_pk_add_f32 to optimize fsub on v2f32. In addition, split fneg on wider vectors into v2f32 pieces so that each piece can match the source modifier of a v2f32 fadd.
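The idea behind the commit message can be illustrated with a small model. This is only a sketch and not the code from the patch: `pk_add_f32`, `fsub_v2`, `split_v2`, and `fadd_fneg_wide` are hypothetical helper names, the `neg_b` flag is a simplified stand-in for a source negate modifier, and Python floats (double precision) stand in for fp32 lanes.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]


def pk_add_f32(a: Vec2, b: Vec2, neg_b: bool = False) -> Vec2:
    """Hypothetical model of a two-lane packed add.

    neg_b models a negate source modifier applied to the second operand
    (an assumption made for illustration, not the real encoding).
    """
    if neg_b:
        b = (-b[0], -b[1])
    return (a[0] + b[0], a[1] + b[1])


def fsub_v2(a: Vec2, b: Vec2) -> Vec2:
    # fsub a, b  ==  fadd a, fneg(b): one packed add with the negate
    # folded into the operand instead of two scalar subtracts.
    return pk_add_f32(a, b, neg_b=True)


def split_v2(v: List[float]) -> List[Vec2]:
    # Split a wider vector (even length assumed) into two-lane pieces.
    assert len(v) % 2 == 0
    return [(v[i], v[i + 1]) for i in range(0, len(v), 2)]


def fadd_fneg_wide(a: List[float], b: List[float]) -> List[float]:
    # fadd a, fneg(b) on a wider vector: split both operands into
    # two-lane pieces so each fneg folds into a packed add's modifier.
    out: List[float] = []
    for pa, pb in zip(split_v2(a), split_v2(b)):
        out.extend(pk_add_f32(pa, pb, neg_b=True))
    return out
```

For instance, under this model `fsub_v2((5.0, 1.5), (2.0, 0.5))` evaluates as a single packed add with the second operand negated, and a four-element `fadd_fneg_wide` call issues two such packed adds.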