DeepSpeed
add lm_head and embed_out tensor parallel
#3962
Merged
