[CI][ROCm] fix device visibility, again (#91813)
The previous PR, #91137, was incomplete. Although it correctly queried the number of available GPUs, test files could still end up sharing the same GPU. This PR lifts the maxtasksperchild=1 restriction so that Pool workers are no longer replaced after every task and each worker always uses the same GPU. This also adds a Note in run_test.py for future reference.
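
For illustration only, here is a minimal sketch of the idea of pinning one GPU per long-lived Pool worker. It is not the code in run_test.py: the helper names (`_claim_gpu`, `run_test_file`, `run_all`), the queue-based assignment, and the default GPU count are hypothetical, and it assumes a POSIX system where the "fork" start method is available. `HIP_VISIBLE_DEVICES` is the ROCm environment variable that restricts which devices a process can see.

```python
import multiprocessing as mp
import os

NUM_GPUS = 2  # hypothetical; the real value would come from querying the machine


def _claim_gpu(gpu_queue):
    # Runs once per worker process, when the worker starts.
    # The worker takes one GPU index and keeps it for its whole lifetime.
    gpu_id = gpu_queue.get()
    os.environ["HIP_VISIBLE_DEVICES"] = str(gpu_id)


def run_test_file(name):
    # Stand-in for launching a test file; it only reports which worker
    # process handled it and which GPU that worker is pinned to.
    return name, os.getpid(), os.environ["HIP_VISIBLE_DEVICES"]


def run_all(test_files, num_gpus=NUM_GPUS):
    ctx = mp.get_context("fork")  # assumption: Linux-style fork is available
    gpu_queue = ctx.SimpleQueue()
    for gpu_id in range(num_gpus):
        gpu_queue.put(gpu_id)
    # No maxtasksperchild here: workers live for the whole run, so the GPU
    # claimed in the initializer stays attached to the same process.
    with ctx.Pool(
        processes=num_gpus,
        initializer=_claim_gpu,
        initargs=(gpu_queue,),
    ) as pool:
        return pool.map(run_test_file, test_files)
```

In this sketch the one-time assignment in the initializer only works because workers persist. With maxtasksperchild=1 the Pool would start a fresh worker after every task, and a per-worker assignment made at startup would no longer describe a stable worker-to-GPU mapping.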
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91813
Approved by: https://github.com/kit1980, https://github.com/huydhn, https://github.com/malfet