Fix pr-status.js: retry on API errors and report timed-out jobs (#92205)
### What?
Fixes two bugs in `scripts/pr-status.js` that caused the script to
silently report zero failures when there were actual CI failures.
### Why?
When running `node scripts/pr-status.js 92080`, the script reported "No
failed jobs found" despite the PR having real failures. Two root causes:
1. **Transient API errors silently swallowed.** `getAllJobs()` had a
bare `catch { break }` that returned an empty array on any GitHub API
error (e.g., HTTP 502). Since the GitHub Actions jobs API for large runs
frequently returns transient 502s, this caused false "no failures"
reports.
2. **`timed_out` and `startup_failure` jobs were invisible.** The script
only checked for `conclusion === 'failure'`, but GitHub uses distinct
conclusion values like `timed_out` (job exceeded timeout) and
`startup_failure` (runner failed to start). These jobs fell through all
filters and were silently omitted from reports.
### How?
**Retry logic in `getAllJobs()`:**
- Retries each paginated API call up to 3 times, backing off 2s and then 4s
- If all retries fail on the first page, throws an error (no silent
empty results)
- If later pages fail after partial data is collected, warns and returns
what was fetched
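
The retry behavior described above could look roughly like the following. This is a minimal sketch rather than the code in `scripts/pr-status.js`: `fetchPage`, `fetchWithRetry`, `getAllJobsSketch`, and the options object are hypothetical names, and the sketch assumes "up to 3 times" means three attempts per page with 2s and 4s waits between them.

```javascript
// Hypothetical sketch. `fetchPage(page)` stands in for whatever call fetches
// one page of jobs; it resolves to an array and rejects on an API error.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

async function fetchWithRetry(fetchPage, page, { attempts = 3, baseDelayMs = 2000 } = {}) {
  let lastError
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fetchPage(page)
    } catch (error) {
      lastError = error
      if (attempt < attempts) {
        // With the defaults: wait 2s after the first failure, 4s after the second.
        await sleep(baseDelayMs * 2 ** (attempt - 1))
      }
    }
  }
  throw lastError
}

async function getAllJobsSketch(fetchPage, options) {
  const jobs = []
  for (let page = 1; ; page++) {
    let batch
    try {
      batch = await fetchWithRetry(fetchPage, page, options)
    } catch (error) {
      if (page === 1) {
        // Nothing was fetched at all: fail loudly instead of returning [].
        throw new Error(`Failed to fetch jobs: ${error.message}`)
      }
      // Partial data exists: warn and return what was collected so far.
      console.warn(`Warning: page ${page} failed after retries; returning ${jobs.length} jobs`)
      break
    }
    if (batch.length === 0) break // an empty page marks the end in this sketch
    jobs.push(...batch)
  }
  return jobs
}
```

The key design point is that an error on the first page can no longer be mistaken for "no jobs", while a late-page error still yields a usable, clearly flagged partial report.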
**Broader failure detection with `FAILED_CONCLUSIONS`:**
- Added a shared `FAILED_CONCLUSIONS` set: `{'failure', 'timed_out',
'startup_failure'}`
- Used in `getFailedJobs()`, `categorizeJobs()`, and flaky test
detection
- Failed jobs whose conclusion is not `failure` are annotated in the
report table (e.g., "(timed_out)")
- `getFailedJobs()` now also returns the `conclusion` field so
downstream code can distinguish failure types
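
As an illustration of the shared set, here is a small sketch of how the filtering and annotation might fit together. Only `FAILED_CONCLUSIONS` and its three values come from this change; `getFailedJobsSketch`, `formatJobLabel`, and the job object shape (`id`, `name`, `conclusion`) are assumptions made for the example.

```javascript
// The set of conclusions treated as failures, as described in this change.
const FAILED_CONCLUSIONS = new Set(['failure', 'timed_out', 'startup_failure'])

// Hypothetical helper: keeps any job whose conclusion counts as a failure and
// carries the conclusion along so callers can tell the failure types apart.
function getFailedJobsSketch(jobs) {
  return jobs
    .filter((job) => FAILED_CONCLUSIONS.has(job.conclusion))
    .map((job) => ({ id: job.id, name: job.name, conclusion: job.conclusion }))
}

// Hypothetical helper: plain failures keep their name; other failure types
// get the conclusion appended, e.g. "build (timed_out)".
function formatJobLabel(job) {
  return job.conclusion === 'failure' ? job.name : `${job.name} (${job.conclusion})`
}

const sampleJobs = [
  { id: 1, name: 'lint', conclusion: 'success' },
  { id: 2, name: 'test', conclusion: 'failure' },
  { id: 3, name: 'build', conclusion: 'timed_out' },
  { id: 4, name: 'deploy', conclusion: 'startup_failure' },
]

const failedLabels = getFailedJobsSketch(sampleJobs).map(formatJobLabel)
console.log(failedLabels)
```

A check against `conclusion === 'failure'` alone would have reported only `test` from this sample.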
Co-authored-by: Tobias Koppers <sokra@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>