feat(ci): detect tests failing across multiple PRs

- Scan other open PRs for similar test failures
- Show cross-PR alert when failures appear in multiple PRs
- Flag a likely infrastructure issue when 3+ PRs have the same failure
- Help distinguish flaky infrastructure from real test regressions
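
The grouping logic above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this change: the function name `find_cross_pr_failures`, the input shape (PR number mapped to a set of failing test names), and exact-name matching are all hypothetical. Only the 3+ PR threshold comes from this commit.

```python
from collections import defaultdict

# Threshold taken from this change: 3+ PRs sharing a failure suggests infra.
INFRA_THRESHOLD = 3


def find_cross_pr_failures(failures_by_pr, threshold=INFRA_THRESHOLD):
    """Group failing tests by the PRs they fail in.

    failures_by_pr: hypothetical mapping {pr_number: set of failing test names}.
    Returns {test_name: {"prs": [...], "likely_infra": bool}} for every test
    that fails in more than one PR.
    """
    # Invert the mapping: test name -> set of PRs where it fails.
    prs_by_test = defaultdict(set)
    for pr, tests in failures_by_pr.items():
        for test in tests:
            prs_by_test[test].add(pr)

    report = {}
    for test, prs in prs_by_test.items():
        if len(prs) > 1:  # seen in multiple PRs: raise a cross-PR alert
            report[test] = {
                "prs": sorted(prs),
                # Assumed rule: at or above the threshold, flag as likely infra.
                "likely_infra": len(prs) >= threshold,
            }
    return report
```

A test failing in only one PR is left out of the report, since it is more likely a real regression in that PR.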