DeepSpeed
b288cf1b - Enable contiguous gradients with Z1+MoE (#2250)

Commit
3 years ago
MoE training with ZeRO stage 1 only works with `contiguous_gradients=True`.
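As a minimal sketch of the setting this commit message refers to, the config below combines ZeRO stage 1 with contiguous gradients. The `zero_optimization` section and its `stage` and `contiguous_gradients` keys are DeepSpeed config options. The helper name `make_zero1_moe_config` and the batch size are hypothetical and chosen only for illustration. The MoE model setup itself is not shown.

```python
import json


def make_zero1_moe_config(train_batch_size=8):
    """Hypothetical helper: build a DeepSpeed-style config dict for ZeRO stage 1.

    The batch size default is an illustrative assumption, not a value
    taken from the commit.
    """
    return {
        "train_batch_size": train_batch_size,
        "zero_optimization": {
            "stage": 1,
            # Per the commit message, MoE training with ZeRO stage 1
            # requires contiguous gradients to be enabled.
            "contiguous_gradients": True,
        },
    }


ds_config = make_zero1_moe_config()
print(json.dumps(ds_config, indent=2))
```

A dict like this would typically be written out as the JSON config file, or passed as the config when initializing the engine.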