DeepSpeed
ee4a9295 - Port `parallel_state.py` (`mpu`) from Megatron-Deepspeed

Commit
264 days ago
Port `parallel_state.py` (`mpu`) from Megatron-Deepspeed

Ulysses needs `mpu` outside of Megatron-DeepSpeed, so this PR ports the `mpu` code. Trimming the file down to just the sequence-parallel (SP) groups appears non-trivial, because DeepSpeed calls into many other `mpu` methods whenever `mpu is not None`.
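
To illustrate why a trimmed file is fragile, here is a minimal sketch in plain Python. Everything in it is hypothetical and exists only for illustration: `TrimmedMPU`, `engine_setup`, and `missing_methods` are not DeepSpeed code, and the method names are assumed to follow the Megatron-style `get_*_parallel_*` naming. The sketch shows that an `mpu` exposing only SP accessors fails as soon as a caller that checks `mpu is not None` asks it for anything else.

```python
class TrimmedMPU:
    """Hypothetical mpu that exposes only sequence-parallel (SP) accessors."""

    def __init__(self, sp_world_size, sp_rank):
        self._sp_world_size = sp_world_size
        self._sp_rank = sp_rank

    def get_sequence_parallel_world_size(self):
        return self._sp_world_size

    def get_sequence_parallel_rank(self):
        return self._sp_rank


def engine_setup(mpu=None):
    """Hypothetical stand-in for a caller that queries mpu when one is given."""
    if mpu is None:
        # Without an mpu the caller falls back to a default.
        return {"mp_world_size": 1}
    # With an mpu the caller assumes the full interface is present.
    # TrimmedMPU lacks this method, so the call raises AttributeError.
    return {"mp_world_size": mpu.get_model_parallel_world_size()}


def missing_methods(mpu, required):
    """Return the names in `required` that `mpu` does not provide as callables."""
    return [name for name in required if not callable(getattr(mpu, name, None))]
```

A check like `missing_methods` could be run against a candidate trimmed `mpu` with the list of accessors a caller is known to use. Porting the whole file avoids having to maintain such a list.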