feat: use multi-runner on docker publish (#6556)
## Summary
This PR parallelizes multi-platform builds using multiple workers (hence
the new docker-build / docker-publish jobs). This seems to save about 8
minutes.
This is a standalone piece of work extracted from
https://github.com/astral-sh/uv/pull/6053.