DeepSpeed
Check for None tensors and skip them when splitting the buckets in ZeRO stage 2.
#728
Merged
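
For illustration only, the idea in the title can be sketched as below. This is not the code from this PR or DeepSpeed's implementation: `split_into_buckets`, `grads`, and `bucket_size` are hypothetical names, and plain Python lists stand in for gradient tensors. The sketch assumes a bucket is a group of tensors whose combined element count stays within a size limit, and that a `None` entry marks a parameter with no gradient.

```python
from typing import List, Optional, Sequence


def split_into_buckets(
    grads: Sequence[Optional[List[float]]],
    bucket_size: int,
) -> List[List[List[float]]]:
    """Group gradients into buckets of at most `bucket_size` elements.

    Hypothetical helper: entries that are None (no gradient available)
    are skipped instead of being measured or appended.
    """
    buckets: List[List[List[float]]] = []
    current: List[List[float]] = []
    current_numel = 0

    for g in grads:
        # The check the title describes: ignore missing tensors entirely.
        if g is None:
            continue

        # Close the current bucket if adding this tensor would overflow it.
        if current and current_numel + len(g) > bucket_size:
            buckets.append(current)
            current, current_numel = [], 0

        current.append(g)
        current_numel += len(g)

    # Flush whatever is left in the last, partially filled bucket.
    if current:
        buckets.append(current)

    return buckets


example = split_into_buckets(
    [[1.0, 2.0], None, [3.0], None, [4.0, 5.0, 6.0]],
    bucket_size=3,
)
```

Without the `None` check, the call to `len(g)` would raise a `TypeError` on the first missing entry, which is the kind of failure a guard like this is meant to avoid.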
