DeepSpeed
Handle activation checkpointing args that are None or non-tensors
#660
Merged
