Add MLP/lm_head tp grain size setting. (#6828)
This PR adds a setting for the MLP/lm_head tensor-parallel (TP) sharding grain size to the deepspeed.init_inference() API, making it more flexible to control how these layers are partitioned.
DNN libraries favor tensor sizes with power-of-2 granularity, so we pick 64 as the default grain size.
The goal is to let users set the MLP/lm_head TP grain size flexibly. This is a preliminary solution; if there is a better approach, we are happy to discuss it. Thanks~
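
A minimal usage sketch of how the new knob might be passed. The key name `tp_grain_size` and its placement inside the `tensor_parallel` config are assumptions for illustration and may differ from the merged implementation:

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")

# Sketch only: "tp_grain_size" is the assumed name of the knob added by this
# PR, controlling the granularity used when sharding MLP/lm_head weights.
engine = deepspeed.init_inference(
    model,
    dtype=torch.float16,
    tensor_parallel={
        "tp_size": 2,          # shard MLP/lm_head across 2 GPUs
        "tp_grain_size": 64,   # assumed setting; 64 is the proposed default
    },
)
```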
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>