Skip NVIDIA driver installation if it's already there (#85435)
Address flaky failures such as https://github.com/pytorch/pytorch/actions/runs/3099236524/jobs/5018444060, in which the NVIDIA driver had already been installed on the runner. The installation is now skipped when the installed driver version matches the expected one.
I also moved the NVIDIA driver installation ahead of the installation of docker NVIDIA support, so that the latter cannot interfere with the driver installation.
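
For reference, the skip logic can be sketched roughly as below. This is a minimal illustration reconstructed from the xtrace output in the Testing section, not the exact contents of `.github/scripts/install_nvidia_utils_linux.sh`. The helper names `installed_driver_version` and `needs_driver_install` are hypothetical, as is the `head -n 1` guard for multi-GPU hosts; only the `nvidia-smi` query and the two log messages are taken from the logs.

```shell
#!/usr/bin/env bash

# Expected driver version, as seen in the logs (NVIDIA-Linux-x86_64-515.57.run).
DRIVER_VERSION="515.57"

# Hypothetical helper: print the installed driver version, or nothing
# when nvidia-smi is not available on this machine.
installed_driver_version() {
  if [ -x "$(command -v nvidia-smi)" ]; then
    # nvidia-smi prints one line per GPU; keep only the first (assumption).
    nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
  fi
}

# Hypothetical helper: return 0 when the driver must be installed,
# 1 when the expected version is already present and the step can be skipped.
needs_driver_install() {
  local expected="$1" installed="$2"
  if [ -z "$installed" ]; then
    # No driver detected at all.
    return 0
  fi
  if [ "$installed" != "$expected" ]; then
    echo "NVIDIA driver ($installed) has been installed, but we expect to have $expected instead. Continuing with NVIDIA driver installation"
    return 0
  fi
  echo "NVIDIA driver ($installed) has already been installed. Skipping NVIDIA driver installation"
  return 1
}

if needs_driver_install "$DRIVER_VERSION" "$(installed_driver_version)"; then
  # Placeholder only: the real script downloads and runs the .run installer here.
  echo "Driver installation would run here"
fi
```

Splitting the version comparison from the `nvidia-smi` call keeps the decision testable on machines without a GPU, which is why the sketch takes the installed version as an argument.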
### Testing
* Run `.github/scripts/install_nvidia_utils_linux.sh` manually with an existing but different NVIDIA driver installed (515.65.01)
```
== Installing nvidia driver NVIDIA-Linux-x86_64-515.57.run ==
+ HAS_NVIDIA_DRIVER=0
++ command -v nvidia-smi
+ '[' -x /usr/bin/nvidia-smi ']'
++ nvidia-smi --query-gpu=driver_version --format=csv,noheader
+ INSTALLED_DRIVER_VERSION=515.65.01
+ '[' 515.65.01 '!=' 515.57 ']'
+ echo 'NVIDIA driver (515.65.01) has been installed, but we expect to have 515.57 instead. Continuing with NVIDIA driver installation'
NVIDIA driver (515.65.01) has been installed, but we expect to have 515.57 instead. Continuing with NVIDIA driver installation
+ '[' 0 -eq 0 ']'
+ sudo yum groupinstall -y 'Development Tools'
Loaded plugins: dkms-build-requires, extras_suggestions, langpacks, priorities, update-motd
Maybe run: yum groups mark install (see man yum)
No packages in any requested group available to install or update
++ uname -r
+ sudo yum install -y 'kernel-devel-uname-r == 4.14.290-217.505.amzn2.x86_64'
Loaded plugins: dkms-build-requires, extras_suggestions, langpacks, priorities, update-motd
Package kernel-devel-4.14.290-217.505.amzn2.x86_64 already installed and latest version
Nothing to do
+ sudo modprobe backlight
+ sudo curl -fsL -o /tmp/nvidia_driver https://s3.amazonaws.com/ossci-linux/nvidia_driver/NVIDIA-Linux-x86_64-515.57.run
+ sudo /bin/bash /tmp/nvidia_driver -s --no-drm
...
```
* Run `.github/scripts/install_nvidia_utils_linux.sh` manually with the same NVIDIA driver installed (515.57)
```
== Installing nvidia driver NVIDIA-Linux-x86_64-515.57.run ==
+ HAS_NVIDIA_DRIVER=0
++ command -v nvidia-smi
+ '[' -x /usr/bin/nvidia-smi ']'
++ nvidia-smi --query-gpu=driver_version --format=csv,noheader
+ INSTALLED_DRIVER_VERSION=515.57
+ '[' 515.57 '!=' 515.57 ']'
+ HAS_NVIDIA_DRIVER=1
+ echo 'NVIDIA driver (515.57) has already been installed. Skipping NVIDIA driver installation'
NVIDIA driver (515.57) has already been installed. Skipping NVIDIA driver installation
+ '[' 1 -eq 0 ']'
+ nvidia-smi
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85435
Approved by: https://github.com/seemethere, https://github.com/malfet