DeepSpeed
e46d808a - MoE inference + PR-MoE model support (#1705)

MoE inference + PR-MoE model support (#1705)

Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Zhewei Yao <zheweiy@berkeley.edu>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
Changed files:

  • MANIFEST_win.in
  • csrc/transformer/inference/csrc/gelu.cu
  • csrc/transformer/inference/csrc/pt_binding.cpp
  • csrc/transformer/inference/includes/context.h
  • csrc/transformer/inference/includes/custom_cuda_layers.h
  • deepspeed/__init__.py
  • deepspeed/inference/engine.py
  • deepspeed/module_inject/replace_module.py
  • deepspeed/module_inject/replace_policy.py
  • deepspeed/moe/experts.py
  • deepspeed/moe/layer.py
  • deepspeed/moe/sharded_moe.py
  • deepspeed/moe/utils.py
  • deepspeed/ops/transformer/__init__.py
  • deepspeed/ops/transformer/inference/__init__.py
  • deepspeed/ops/transformer/inference/moe_inference.py
  • deepspeed/ops/transformer/inference/transformer_inference.py
  • deepspeed/runtime/engine.py
  • deepspeed/runtime/fp16/fused_optimizer.py
  • deepspeed/runtime/utils.py
  • deepspeed/runtime/weight_quantizer.py
  • deepspeed/runtime/zero/stage_1_and_2.py
  • deepspeed/utils/groups.py
  • op_builder/__init__.py
  • op_builder/builder.py
  • setup.py
  • tests/unit/test_checkpointing.py
  • tests/unit/test_moe.py
  • version.txt
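The routing files touched by this commit (deepspeed/moe/sharded_moe.py, deepspeed/ops/transformer/inference/moe_inference.py) implement gated expert dispatch for inference. As a conceptual illustration only, here is a stdlib-only Python sketch of top-1 MoE gating: score experts for a token, pick the highest-probability expert, and scale that expert's output by its gate probability. All names and shapes here are hypothetical and this is not DeepSpeed's API.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top1_gate(gate_logits):
    # Return (index of the highest-scoring expert, its gate probability).
    probs = softmax(gate_logits)
    idx = max(range(len(probs)), key=lambda i: probs[i])
    return idx, probs[idx]

def moe_forward(token, gate_weights, experts):
    # Route one token (a list of floats) through the single expert
    # chosen by the gate, scaling its output by the gate probability.
    logits = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    idx, p = top1_gate(logits)
    return [p * y for y in experts[idx](token)]

# Toy usage: two "experts" and a 2-feature token.
token = [1.0, 2.0]
gate_weights = [[1.0, 0.0], [0.0, 1.0]]      # logits = [1.0, 2.0]
experts = [lambda t: [x + 1 for x in t],     # expert 0: add one
           lambda t: [2 * x for x in t]]     # expert 1: double
out = moe_forward(token, gate_weights, experts)
```

Top-1 gating keeps inference cheap: each token executes exactly one expert network, while the gate probability weights that expert's contribution.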