[Misc] Add EPLB debug logging for balance diagnostics
Add debug logging to the EPLB system to help diagnose expert load
balancing issues in wideEP deployments:
- Per-step balance breakdown: worst/best layer indices, min/max rank
  token counts for the worst layer, replica distribution stats
- Pre-rearrange diagnostics: window utilization, load distribution
  across logical experts, top-5 hottest experts
- Post-rearrange diagnostics: number of changed slots, replica count
  stats, predicted post-rearrange balancedness (simulates expected
  balance with the new mapping applied to current load data)
- Warning when window_size > step_interval (stale data risk)
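
For reviewers, the predicted post-rearrange balancedness and the
window-size warning can be sketched roughly as below. This is an
illustrative sketch, not the code in this change. It assumes that
balancedness is mean per-rank load divided by max per-rank load, that
an expert's load splits evenly across its replicas, and that physical
slots are laid out contiguously by rank. All function names, the logger
name, and the sample numbers are hypothetical.

```python
import logging
from collections import Counter
from typing import Sequence

logger = logging.getLogger("eplb_sketch")  # hypothetical logger name


def balancedness(rank_loads: Sequence[float]) -> float:
    """Mean rank load over max rank load (assumed metric); 1.0 is ideal."""
    peak = max(rank_loads)
    if peak == 0:
        return 1.0  # no traffic at all: treat as balanced
    return (sum(rank_loads) / len(rank_loads)) / peak


def predicted_rank_loads(
    logical_load: Sequence[float],
    phys_to_logical: Sequence[int],
    num_ranks: int,
) -> list[float]:
    """Apply a physical-slot -> logical-expert mapping to current load data."""
    replicas = Counter(phys_to_logical)  # replica count per logical expert
    slots_per_rank = len(phys_to_logical) // num_ranks
    loads = [0.0] * num_ranks
    for slot, logical in enumerate(phys_to_logical):
        # Assumption: load is shared evenly among an expert's replicas.
        loads[slot // slots_per_rank] += logical_load[logical] / replicas[logical]
    return loads


def window_exceeds_interval(window_size: int, step_interval: int) -> bool:
    """Warn when the load window is longer than the rearrange interval."""
    if window_size > step_interval:
        logger.warning(
            "window_size (%d) > step_interval (%d): stale data risk",
            window_size,
            step_interval,
        )
        return True
    return False


# Made-up example: 4 logical experts, 6 physical slots, 2 ranks.
logical_load = [90.0, 10.0, 10.0, 10.0]
old_map = [0, 1, 1, 2, 3, 3]  # hot expert 0 has a single replica
new_map = [0, 1, 2, 0, 3, 3]  # hot expert 0 replicated on both ranks
old_loads = predicted_rank_loads(logical_load, old_map, num_ranks=2)
new_loads = predicted_rank_loads(logical_load, new_map, num_ranks=2)
```

Comparing balancedness(old_loads) with balancedness(new_loads) gives the
kind of before/after figure the post-rearrange log line reports, under
the assumptions stated above.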
Signed-off-by: Travis Shears <travis@neuralmagic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>