DeepSpeed
e46d808a
MoE inference + PR-MoE model support (#1705)
Commit
3 years ago
MoE inference + PR-MoE model support (#1705)

Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Zhewei Yao <zheweiy@berkeley.edu>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
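This commit extends the training-side MoE layer (deepspeed/moe/layer.py in the file list below) and adds PR-MoE (Pyramid-Residual MoE) support. A minimal sketch of constructing that layer, assuming the 0.5.x-era constructor names, that use_residual selects the PR-MoE/Residual-MoE variant, and that expert-parallel groups are set up via groups.initialize(); none of these are verified against the current API:

```python
# Minimal sketch of building an MoE block with DeepSpeed's MoE layer.
# Argument names, the use_residual flag, and groups.initialize() follow
# the 0.5.x-era API and are assumptions, not a verified current API.
import torch
from deepspeed.moe.layer import MoE
from deepspeed.utils import groups

# Expert parallelism is built on torch.distributed; a single-process
# group with ep_size=1 is enough for this sketch.
torch.distributed.init_process_group(
    backend="gloo", init_method="tcp://127.0.0.1:29500",
    world_size=1, rank=0)
groups.initialize(ep_size=1)  # assumed 0.5.x-era expert-parallel setup

hidden_size = 1024

# The expert is an ordinary feed-forward block; MoE replicates it
# num_experts times and routes tokens between the copies.
expert = torch.nn.Sequential(
    torch.nn.Linear(hidden_size, 4 * hidden_size),
    torch.nn.GELU(),
    torch.nn.Linear(4 * hidden_size, hidden_size),
)

moe = MoE(
    hidden_size=hidden_size,
    expert=expert,
    num_experts=8,      # experts in this layer
    k=1,                # top-1 gating
    use_residual=True,  # assumed flag selecting the PR-MoE variant
)

# forward returns the routed output plus the auxiliary load-balancing
# loss and per-expert token counts
x = torch.randn(4, 16, hidden_size)
output, l_aux, exp_counts = moe(x)
```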
References
#1705 - MoE inference + PR-MoE model support
Author
jeffra
Parents
3293cf72
Files (29)

MANIFEST_win.in
csrc/transformer/inference/csrc/gelu.cu
csrc/transformer/inference/csrc/pt_binding.cpp
csrc/transformer/inference/includes/context.h
csrc/transformer/inference/includes/custom_cuda_layers.h
deepspeed/__init__.py
deepspeed/inference/engine.py
deepspeed/module_inject/replace_module.py
deepspeed/module_inject/replace_policy.py
deepspeed/moe/experts.py
deepspeed/moe/layer.py
deepspeed/moe/sharded_moe.py
deepspeed/moe/utils.py
deepspeed/ops/transformer/__init__.py
deepspeed/ops/transformer/inference/__init__.py
deepspeed/ops/transformer/inference/moe_inference.py
deepspeed/ops/transformer/inference/transformer_inference.py
deepspeed/runtime/engine.py
deepspeed/runtime/fp16/fused_optimizer.py
deepspeed/runtime/utils.py
deepspeed/runtime/weight_quantizer.py
deepspeed/runtime/zero/stage_1_and_2.py
deepspeed/utils/groups.py
op_builder/__init__.py
op_builder/builder.py
setup.py
tests/unit/test_checkpointing.py
tests/unit/test_moe.py
version.txt
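The inference side of the commit (deepspeed/inference/engine.py and deepspeed/ops/transformer/inference/moe_inference.py above) wraps a trained MoE model for kernel-injected serving via deepspeed.init_inference. A hedged sketch: the moe, moe_experts, and moe_type keywords follow the 0.5.x-era signature and should be treated as assumptions, and build_moe_transformer() is a hypothetical stand-in for loading the trained PR-MoE checkpoint:

```python
# Hedged sketch of kernel-injected MoE inference. The moe, moe_experts,
# and moe_type keywords are assumptions based on the 0.5.x-era
# deepspeed.init_inference signature; build_moe_transformer() is a
# hypothetical helper standing in for a trained PR-MoE model.
import torch
import deepspeed

model = build_moe_transformer()  # hypothetical: returns the trained MoE model

engine = deepspeed.init_inference(
    model,
    mp_size=1,                        # tensor-model-parallel degree
    dtype=torch.half,                 # the fused inference kernels target fp16
    replace_with_kernel_inject=True,  # swap modules for the CUDA inference ops
    moe=True,                         # assumed: enable MoE-aware injection
    moe_experts=[8],                  # assumed: experts per layer (list for pyramid layouts)
    moe_type="residual",              # assumed: selects the PR-MoE residual path
)

with torch.no_grad():
    tokens = torch.randint(0, 50257, (1, 16), device="cuda")
    logits = engine(tokens)
```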