DeepSpeed
c20f6fa4 - support baichuan model: (#4721)

Commit
2 years ago
support baichuan model: (#4721)

* fix Baichuan meta data error
* add BaichuanLayer and DecoderLayer to glmtype when prepare tp fused qkvw
* add get_alibi_mask function for Baichuan to enable TP

---------

Co-authored-by: Lai, Yejing <yejing.lai@intel.com>
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>