Use weights_only for torch.load checkpoints (#28097)
### Description
This PR updates PyTorch checkpoint loading in the T5 helper and the NVIDIA
pretraining resume script to prefer `torch.load(..., weights_only=True)`.
Newer PyTorch versions recommend this safer load mode for checkpoints
because the default pickle-based loading can execute arbitrary code when
a `.pt` file is malicious.
Behavior across PyTorch versions:
* PyTorch 1.13 to 2.5: `weights_only` exists, but defaults to `False`
* PyTorch 2.6+: default changed to `True` (security-driven change)
### Summary of Changes
| File | Change |
|------|--------|
| `onnxruntime/python/tools/transformers/models/t5/t5_helper.py` | Adds a local helper that loads state dict checkpoints with `weights_only=True` when available and uses it for `state_dict_path`. |
| `orttraining/tools/scripts/nv_run_pretraining.py` | Adds the same compatibility helper and uses it when resuming from `ckpt_*.pt` training checkpoints. |
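
For reviewers, here is a minimal sketch of what such a compatibility helper might look like. It is not the code from this PR: the names `supports_weights_only` and `load_state_dict_checkpoint` and the `load_fn` parameter are hypothetical, and the approach (inspecting the signature of `torch.load`) is only one way to detect support.

```python
import inspect


def supports_weights_only(load_fn):
    """Return True if load_fn declares a `weights_only` parameter."""
    try:
        params = inspect.signature(load_fn).parameters
    except (TypeError, ValueError):
        # Signature not introspectable; assume the option is unavailable.
        return False
    return "weights_only" in params


def load_state_dict_checkpoint(path, map_location="cpu", load_fn=None):
    """Load a checkpoint, passing weights_only=True when it is supported.

    `load_fn` is a hypothetical hook for testing; it defaults to torch.load.
    """
    if load_fn is None:
        import torch  # imported lazily so the sketch can be read without PyTorch

        load_fn = torch.load
    if supports_weights_only(load_fn):
        return load_fn(path, map_location=map_location, weights_only=True)
    # Older PyTorch: fall back to the legacy (unrestricted) pickle load.
    return load_fn(path, map_location=map_location)
```

Note that the fallback branch still performs an unrestricted pickle load, so the protection only applies on PyTorch versions that accept `weights_only`.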
### Motivation and Context
By default, `torch.load` deserializes Python pickle payloads. On
supported PyTorch versions, `weights_only=True` restricts unpickling to
tensors, primitive types, dictionaries, and explicitly allowlisted
types, which makes it the safer choice for loading model weights.
This reduces risk if an attacker can place or substitute a local `.pt`
file, or persuades an operator to download and resume from a malicious
checkpoint.
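
The underlying risk can be illustrated with the standard `pickle` module alone. This is a deliberately harmless sketch, not a real exploit: the `Payload` class is hypothetical, and `sorted` stands in for whatever callable an attacker would choose.

```python
import pickle


class Payload:
    def __reduce__(self):
        # Tells pickle to call sorted([3, 1, 2]) when the data is loaded.
        # An attacker could name any importable callable here instead.
        return (sorted, ([3, 1, 2],))


data = pickle.dumps(Payload())

# Loading runs the embedded call; the result is whatever the callable
# returned, not a Payload instance.
result = pickle.loads(data)
```

A restricted unpickler, such as the one `weights_only=True` enables, is intended to reject globals that are not on its allowlist rather than call them.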
### Testing
- `python -m py_compile onnxruntime/python/tools/transformers/models/t5/t5_helper.py orttraining/tools/scripts/nv_run_pretraining.py`