llvm-project
98aea5bf - [AMDGPU] Add infrastructure for machine-level inliner

Commit

176 days ago

[AMDGPU] Add infrastructure for machine-level inliner

Add the necessary infrastructure for the machine-level inliner. The inliner will initially only handle calls to functions with the `amdgpu_gfx_whole_wave` calling convention. Partial inlining is currently not supported - all whole wave functions will be inlined into all their call sites and removed from the module (which should be safe since whole wave functions can't be called indirectly and their address can't be taken). As a consequence, recursive whole wave functions are not supported yet (I'll fix that in a separate patch).

In addition to a MachineFunction pass representing the inliner itself, the patch adds a custom FPPassManager (`AMDGPUInliningPassManager`) which helps manage the inlining process. It does this by suspending the processing of inlined functions when the inliner runs, which means they will have the correct shape when the inliner runs on their callers. After the pass pipeline is run on all the functions in the module, the custom pass manager will finally release the inlined MachineFunctions (in the future, it's easy to update it to run the remainder of the pass pipeline on them instead of just deleting them, making it possible to support partial inlining, and with it recursion). This works because the backend passes already run inside a call graph pass manager, so the callees are always processed before the callers.

The custom pass manager is inserted into the pipeline by another pass, `AMDGPUInliningAnchor`, whose `preparePassManager` method will oust any existing FunctionPass manager and replace it with the inlining pass manager. This makes it possible to use the custom pass manager without any other changes to the pass manager infrastructure. Support for the new pass manager will be part of a different patch.
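The suspend-and-release scheme described above can be illustrated with a small standalone model. This is only a sketch and is not the patch's code: `ToyFunction`, `runToyPipeline`, the pass names, and the `"call X"` string encoding are all hypothetical and use no LLVM APIs. It assumes the functions arrive in callee-before-caller order, as the call graph pass manager provides, and that the inliner also runs on whole wave functions before they are suspended.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model; none of these types exist in LLVM.
struct ToyFunction {
  std::string Name;
  bool IsWholeWave = false;           // stands in for amdgpu_gfx_whole_wave
  std::vector<std::string> Body;      // plain instructions or "call <name>"
  std::vector<std::string> PassesRun; // which pipeline stages touched it
};

// Runs a three-stage pipeline over functions given in post-order
// (callees before callers). Whole wave functions are suspended right
// after the inliner stage, so callers always inline that snapshot.
std::vector<ToyFunction> runToyPipeline(std::vector<ToyFunction> PostOrder) {
  std::map<std::string, ToyFunction> Suspended;
  std::vector<ToyFunction> Emitted;

  for (ToyFunction &F : PostOrder) {
    F.PassesRun.push_back("pre-inline");

    // Inliner stage: splice in the body of every suspended callee.
    std::vector<std::string> NewBody;
    for (const std::string &Inst : F.Body) {
      auto It = Suspended.end();
      if (Inst.rfind("call ", 0) == 0)
        It = Suspended.find(Inst.substr(5));
      if (It == Suspended.end())
        NewBody.push_back(Inst); // not a call to a suspended function
      else
        NewBody.insert(NewBody.end(), It->second.Body.begin(),
                       It->second.Body.end());
    }
    F.Body = NewBody;
    F.PassesRun.push_back("inline");

    if (F.IsWholeWave) {
      // Suspend: skip the rest of the pipeline for this function.
      bool Inserted = Suspended.emplace(F.Name, F).second;
      assert(Inserted && "function names are assumed unique");
      (void)Inserted;
      continue;
    }

    F.PassesRun.push_back("post-inline");
    Emitted.push_back(F);
  }

  // Release: suspended functions are dropped, never emitted.
  Suspended.clear();
  return Emitted;
}
```

In this model a self-recursive whole wave function would keep its `call` to itself, because it is not yet suspended when the inliner stage runs on it. This mirrors the stated limitation that recursion is not supported.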

References

#169477 - [AMDGPU] Update machine frame info during inlining

#169478 - [AMDGPU] Insert inliner anchor earlier

Author

rovka

Committer

rovka

Parents

df28fe28
