llvm-project
d96cbf4a - [AMDGPU] Improve codegen for uniform f16<-->i32 conversions (#176833)

Commit
91 days ago
[AMDGPU] Improve codegen for uniform f16<-->i32 conversions (#176833)

This patch improves codegen by chaining scalar operations for uniform f16<-->i32 conversions where hardware supports the specific SALU operations.

Added patterns in SOPInstructions.td to synthesize f16<-->i32 conversions via intermediate f32 (f16-->f32-->i32 and i32-->f32-->f16).
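
The two chains named in the commit message can be modelled at the value level with a short Python sketch. This is only an illustration of the arithmetic, not the TableGen patterns from SOPInstructions.td and not the hardware's behaviour: the helper names are hypothetical, only signed i32 is modelled, and only finite values that fit in f16 are handled (Python's `struct` raises `OverflowError` for values beyond the f16 range, and NaN or infinite inputs are not covered).

```python
import struct


def round_to_f32(x: float) -> float:
    """Round a Python float (f64) to the nearest f32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_f16(x: float) -> float:
    """Round a Python float to the nearest f16 value (finite, in range only)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def f16_to_i32_via_f32(h: float) -> int:
    """Model f16 --> f32 --> i32 (hypothetical helper, not LLVM code)."""
    widened = round_to_f32(h)  # f16 -> f32 is exact: every f16 value is an f32 value
    return int(widened)        # f32 -> i32 truncates toward zero


def i32_to_f16_via_f32(i: int) -> float:
    """Model i32 --> f32 --> f16 (hypothetical helper, not LLVM code)."""
    widened = round_to_f32(float(i))  # i32 -> f32 may round for large magnitudes
    return round_to_f16(widened)      # f32 -> f16 rounds to the 11-bit significand


if __name__ == "__main__":
    print(f16_to_i32_via_f32(round_to_f16(3.75)))
    print(i32_to_f16_via_f32(1024))
```

The widening step from f16 to f32 loses nothing, so the first chain is determined entirely by the final float-to-integer step. In the second chain, any integer small enough to be finite in f16 is represented exactly in f32, so the only rounding in this model happens in the final narrowing step.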