DeepSpeed
Long sequence parallelism (Ulysses) integration with HuggingFace
#5774
Merged
