DeepSpeed
Checkpointing: Avoid assigning tensor storage with different device
#4836
Merged
