[Draft][CUDA] Use runtime driver API for cuStreamWriteValue32 (#156097)
Fixes #154073
Reference: https://github.com/NVIDIA/Fuser/pull/4197
See PR #154097, which this PR continues.
@nWEIdia is currently out of the office, so I’ve temporarily taken over his work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156097
Approved by: https://github.com/ngimel, https://github.com/cyyever
Co-authored-by: Wei Wang <weiwan@nvidia.com>