[NCCL][AVOID_RECORD_STREAMS] Initialize `stashed_for_allocator_safety_` in `endCoalescing` if `TORCH_NCCL_AVOID_RECORD_STREAMS=1` (#106166)
Currently, `stashed_for_allocator_safety_` is left uninitialized in this path, so any later operation that assumes it is non-null will crash. Operations make that assumption when `TORCH_NCCL_AVOID_RECORD_STREAMS=1`, i.e. when `avoidRecordStreams_` is set.
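To illustrate the failure mode, here is a minimal, self-contained sketch rather than the actual `ProcessGroupNCCL` code. `FakeTensor`, `FakeWork`, `stash`, and `makeCoalescedWork` are hypothetical names invented for this example, and it assumes the member is a `std::shared_ptr` to a vector, which is null until explicitly allocated.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a tensor; only its presence matters here.
struct FakeTensor {
  int id;
};

// Hypothetical, simplified stand-in for a NCCL work object.
struct FakeWork {
  // A default-constructed shared_ptr is nullptr until it is initialized.
  std::shared_ptr<std::vector<FakeTensor>> stashed_for_allocator_safety_;

  // Models a later operation that assumes the vector already exists.
  // Without the assert, a null pointer here would be undefined behavior.
  void stash(const FakeTensor& t) {
    assert(stashed_for_allocator_safety_ != nullptr);
    stashed_for_allocator_safety_->push_back(t);
  }
};

// Sketch of the fixed path: allocate the vector whenever record streams
// are being avoided, so that later stash() calls are safe.
FakeWork makeCoalescedWork(bool avoidRecordStreams) {
  FakeWork work;
  if (avoidRecordStreams) {
    work.stashed_for_allocator_safety_ =
        std::make_shared<std::vector<FakeTensor>>();
  }
  return work;
}
```

In this sketch, calling `stash` on a work object created with `makeCoalescedWork(true)` is safe, whereas skipping the allocation in that branch would leave the pointer null and trip the assertion.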
CC @kwen2501 @ptrblck
@kwen2501
I'm not familiar with what happens to the coalesced work when `endCoalescing` is called. In theory, if the coalesced work has already "stashed for allocator safety," can we also avoid the record-stream calls here? Or is the coalesced work discarded (and its `stashed_for_allocator_safety_` vectors destroyed along with it)?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106166
Approved by: https://github.com/kwen2501