[CI] diff driven test selection (#8077)
TLDR: Analyze the PR's diff and skip tests that don't exercise what has
changed, potentially cutting runtime and expense by 95-99% most of the
time.
Detailed:
DeepSpeed's CI takes forever, burning $$ and wasting dev time for no
reason most of the time, since most changes need just a few tests to run.
HF Transformers has a system that selects which tests to run based on
the diff of the PR. Sylvain Gugger wrote it many years ago, since that
repo probably has thousands of tests by now. DeepSpeed's CI isn't too
bad, but it can easily take hours.
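
For readers new to the idea, here is a minimal sketch of diff-driven selection. It is not the code in this PR or in HF Transformers: the `select_tests` function, the `test_deps` map, the `run_all_triggers` default and all file paths below are hypothetical, and it assumes a map from each test file to the source files it depends on has already been built.

```python
from pathlib import PurePosixPath


def select_tests(changed_files, test_deps, run_all_triggers=("setup.py", "requirements/")):
    """Return the sorted test files to run for a list of changed paths.

    test_deps (hypothetical) maps each test file to the set of source
    files it depends on. run_all_triggers (hypothetical) lists path
    prefixes whose changes force the full suite.
    """
    all_tests = sorted(test_deps)
    selected = set()
    for path in changed_files:
        # Build or dependency changes can affect anything: run everything.
        if path.startswith(tuple(run_all_triggers)):
            return all_tests
        # A modified test file is always selected.
        if path in test_deps:
            selected.add(path)
            continue
        # Skip non-Python files such as docs.
        if PurePosixPath(path).suffix != ".py":
            continue
        # Select every test whose dependencies include the changed file.
        hits = {test for test, deps in test_deps.items() if path in deps}
        if not hits:
            # Unmapped source file: fall back to the full suite to stay safe.
            return all_tests
        selected |= hits
    return sorted(selected)
```

The design choice worth noting is the fallback: whenever the sketch cannot map a changed Python file to any test, it runs everything rather than risk skipping a relevant test.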
So I asked Claude Opus 4.8 to replicate the system for DeepSpeed. Please
have a look. It looks super complicated, so I'm not sure how easy it'd
be to maintain/operate unless we always use AI to continue maintaining
it. I asked Claude to leave a detailed state file so that it or another
model could pick up where it left off.
I started with just the slowest, costliest workflow,
`.github/workflows/modal-torch-latest.yml`, to see if it works well. If
you're happy, we can replicate it to the rest of the workflows.
CC: @loadams, @tjruwase - please tag others who you think would be
helpful in discussing this.
---------
Signed-off-by: Stas Bekman <stas@stason.org>