PaddleNLP
3d023bd1 - Support deepseek-v3/loraGA/ on xpu (#9928)

Commit
1 year ago
Support deepseek-v3/loraGA/ on xpu (#9928)

* support deepseek-v3/loraGA/ on xpu
* fix ds2 config
* fix ds3 eval
* fix seq_aux_loss, weight load time consume, moe tp
* Optimize code of expert dispatch
* optimize moe expert dispatch
* remove redundant code.
* EP support, MTP fix
* remove print
* add alltoall backward
* fix all2all bwd
* support drop tokens, disable acc cal
* fix bug
* add base_model_prefix in ds3 modeling pp
* fix ci shape mismatch
* update