[AMDGPU] Add initial cost function framework for balanced scheduling

Introduce an initial cost function into the AMDGPU instruction scheduler
as the foundation for a more balanced scheduling framework. The goal is
to move beyond occupancy-as-a-hard-target by providing a configurable
mechanism to evaluate trade-offs between different candidate schedules.

Key features:
- Schedule length term weighted by block frequency.
- Weighted occupancy cost with a concave penalty, so that occupancy
  gains from a low starting point are rewarded more than later gains.
- Large additive penalty to strongly discourage schedules that increase
  spilling.
- Configurable weights and knobs to support tuning.
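
For illustration only, the terms listed above could combine roughly as
in the following standalone sketch. It is not the code in this patch:
the struct and function names, the default weights, and the square-root
curve used for diminishing occupancy returns are all assumptions chosen
to make the idea concrete.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical tuning knobs; names and defaults are illustrative only.
struct CostWeights {
  double LengthWeight = 1.0;    // scales the schedule length term
  double OccupancyWeight = 100.0; // scales the occupancy term
  double SpillPenalty = 1e9;    // large additive penalty for extra spills
};

// Hypothetical summary of one candidate schedule.
struct ScheduleMetrics {
  double LengthCycles;   // estimated schedule length of the region
  double BlockFreq;      // relative execution frequency of the block
  unsigned Occupancy;    // achieved occupancy (waves)
  unsigned MaxOccupancy; // maximum achievable occupancy
  unsigned ExtraSpills;  // spills added relative to the baseline schedule
};

// Occupancy cost in [0, 1]. The reward sqrt(Occ / MaxOcc) is concave, so
// going from 1 to 2 waves lowers the cost more than going from 7 to 8.
// The square root is an assumed curve, not taken from the patch.
inline double occupancyCost(unsigned Occ, unsigned MaxOcc) {
  assert(MaxOcc > 0 && "maximum occupancy must be positive");
  double Ratio = static_cast<double>(std::min(Occ, MaxOcc)) / MaxOcc;
  return 1.0 - std::sqrt(Ratio);
}

// Total cost of a candidate schedule; lower is better.
inline double scheduleCost(const ScheduleMetrics &M, const CostWeights &W) {
  double Cost = W.LengthWeight * M.LengthCycles * M.BlockFreq;
  Cost += W.OccupancyWeight * occupancyCost(M.Occupancy, M.MaxOccupancy);
  if (M.ExtraSpills > 0)
    Cost += W.SpillPenalty; // dominates the other terms by design
  return Cost;
}
```

Under these assumptions, a scheduler would compute scheduleCost for each
candidate and keep the cheapest one. A candidate that adds spills loses
to any spill-free candidate unless the weights are retuned.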