Update NCCL submodule to v2.20.5 (#121635)
Updates the NCCL submodule to 2.20.5. This release includes many bug fixes for reduction and connection issues, and should also improve performance. We have been running 2.20.5 internally for a few weeks; the binary pip wheels have finally been published, so we can update main.
Release notes: https://docs.nvidia.com/deeplearning/nccl/release-notes/rel_2-20-5.html#rel_2-20-5
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121635
Approved by: https://github.com/malfet