DeepSpeed
da771ed4 - Add MLP/lm_head tp grain size setting. (#6828)

Commit
183 days ago
Add MLP/lm_head tp grain size setting. (#6828)

This PR adds an MLP/lm_head tensor-parallel (tp) grain size setting to the deepspeed.init_inference() API, so that the sharding granularity of the MLP and lm_head layers can be configured instead of being fixed. DNN libraries favor tensor sizes with power-of-2 granularity, so we pick 64 as the default grain size.

This is a preliminary solution. If there is a better one, we can discuss it together. Thanks~

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
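To make the idea of a sharding grain size concrete, here is a minimal sketch of splitting one dimension across tensor-parallel ranks in whole multiples of a grain. It is not the code from this commit: the function name `grain_shard_sizes`, its parameters, and the choice to hand any leftover elements to the last rank are assumptions made purely for illustration.

```python
from typing import List


def grain_shard_sizes(total_size: int, tp_size: int, grain_size: int = 64) -> List[int]:
    """Hypothetical helper: split total_size across tp_size ranks in grain-sized units.

    Illustrative only; the name, signature, and leftover handling are assumptions,
    not DeepSpeed's implementation.
    """
    if tp_size <= 0 or grain_size <= 0:
        raise ValueError("tp_size and grain_size must be positive")

    if total_size < grain_size:
        # Dimension is smaller than one grain: fall back to a near-even element split.
        base, extra = divmod(total_size, tp_size)
        return [base + (1 if rank < extra else 0) for rank in range(tp_size)]

    # Count whole grains, then distribute them near-evenly (lower ranks get the extras).
    num_grains, leftover = divmod(total_size, grain_size)
    base, extra = divmod(num_grains, tp_size)
    sizes = [(base + (1 if rank < extra else 0)) * grain_size for rank in range(tp_size)]

    # Assumption: elements that do not fill a whole grain go to the last rank.
    sizes[-1] += leftover
    return sizes


if __name__ == "__main__":
    # An 11008-wide MLP dimension on 3 ranks, with two different grain sizes.
    print(grain_shard_sizes(11008, 3, grain_size=64))
    print(grain_shard_sizes(11008, 3, grain_size=128))
```

With a grain of 64, every rank's shard stays a multiple of 64 even when the dimension does not divide evenly by the number of ranks; a larger grain trades a slightly less balanced split for coarser alignment.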
  • deepspeed
    • inference
      • config.py
    • module_inject
      • replace_module.py
      • tp_shard.py