DeepSpeed MoE (#1310)
Co-authored-by: Alex Muzio <Alex.Muzio@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
Co-authored-by: Felipe Cruz Salinas <Andres.Cruz@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <shaden.smith@microsoft.com>
Co-authored-by: Young Jin Kim <youki@microsoft.com>
Co-authored-by: bapatra <bapatra@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Co-authored-by: Shaden Smith <shaden.smith@microsoft.com>
Co-authored-by: Young Jin Kim <youki@microsoft.com>