DeepSpeed
7ae577cd
- Merge branch 'master' into fix-sp-dense
Commit
1 year ago
Merge branch 'master' into fix-sp-dense
References
#4530 - Fix the sequence-parallelism for the dense model architecture
Author
tjruwase
Parents
aaae9949
e2383511
Files
43
.github/workflows
  amd-mi200.yml
README.md
accelerator
  npu_accelerator.py
csrc
  includes
    conversion_utils.h
    cublas_wrappers.h
    ds_kernel_utils.h
    feed_forward.h
    gemm_test.h
    general_kernels.h
    quantizer.h
    reduction_utils.h
    strided_batch_gemm.h
  lamb
    fused_lamb_cuda_kernel.cu
  quantization
    fake_quantizer.cu
  random_ltd
    token_sort.cu
  spatial/includes
    spatial_cuda_layers.h
  transformer
    cublas_wrappers.cu
    ds_transformer_cuda.cpp
    inference
      csrc
        apply_rotary_pos_emb.cu
        pt_binding.cpp
        softmax.cu
        transform.cu
      includes
        inference_context.h
        inference_cublas_wrappers.h
        inference_cuda_layers.h
deepspeed
  comm
    ccl.py
    comm.py
  launcher
    multinode_runner.py
  module_inject/containers
    llama.py
  ops/transformer/inference
    config.py
    op_binding
      linear.py
      softmax_context.py
  runtime/zero
    stage3.py
docs/assets/files
  ICML-5mins.pdf
  SC21-ZeRO-Infinity.pdf
  presentation-mlops.pdf
  sc22-ds-inference.pdf
op_builder
  cpu_adagrad.py
  cpu_adam.py
  npu
    __init__.py
    fused_adam.py
  random_ltd.py
  transformer.py