microsoft/DeepSpeed
Commits
ulysses-offload-tutorial
Merge branch 'master' into ulysses-offload-tutorial
loadams
committed
169 days ago
Verified
e7e6e007
Pin transformers to avoid errors with latest version (#6820)
loadams
committed
169 days ago
Verified
b966e1f9
Merge branch 'master' into ulysses-offload-tutorial
loadams
committed
169 days ago
Verified
ef8d5776
Adding the new feature of FPDT (#6462)
YJHMITWEB
committed
169 days ago
Verified
60a1b57b
Update python version but now we need to include setuptools on our own (#6787)
loadams
committed
170 days ago
Verified
ed7d183b
Merge branch 'master' into ulysses-offload-tutorial
loadams
committed
170 days ago
Verified
152fcb3a
Update formatting
loadams
committed
170 days ago
7795fd3b
Fix zero checkpoint (#6792)
xu-song
committed
170 days ago
Verified
fc230070
update ulysses-offload tutorial
Jinghan Yao
committed
171 days ago
203ef63f
update ulysses-offload tutorial
Jinghan Yao
committed
171 days ago
8e1fb0fe
update ulysses-offload tutorial
Jinghan Yao
committed
171 days ago
39a97cd2
update ulysses-offload tutorial
Jinghan Yao
committed
171 days ago
de18e222
update ulysses-offload tutorial
Jinghan Yao
committed
171 days ago
c272c9fd
Codespell
loadams
committed
171 days ago
681a83f4
Formatting
loadams
committed
171 days ago
bb7646db
Merge branch 'master' into ulysses-offload-tutorial
loadams
committed
171 days ago
Verified
7ab10113
Domino news update on readme.md (#6815)
GuanhuaWang
committed
171 days ago
Verified
0c6c9811
Merge branch 'ulysses-offload-tutorial' of github.com:microsoft/DeepSpeed into ulysses-offload-tutorial
Jinghan Yao
committed
171 days ago
79609fbb
add fpdt tutorial link
Jinghan Yao
committed
171 days ago
cc680209
Update fpdt.md
samadejacobs
committed
172 days ago
Verified
45fee407
add FPDT tutorial
Jinghan Yao
committed
178 days ago
600c5c4f
Update version.txt after 0.16.0 release (#6786)
loadams
committed
179 days ago
Verified
f743feca
Revert release workflow (#6785)
loadams
committed
179 days ago
Verified
e5570b10
Update version.txt before release (#6784)
loadams
committed
179 days ago
Verified
03845dbc
Domino Blog (#6776)
GuanhuaWang
committed
179 days ago
Verified
ec6cc490
Cleanup code docs warnings (#6783)
loadams
committed
179 days ago
Verified
fabcf407
Fix Doc Error: ZeRO Stage 2 gradient partitioning (#6775)
yewentao256
committed
179 days ago
Verified
d6410f90
docs: fix HF links (#6780)
imba-tjd
committed
179 days ago
Verified
5e16f255
Unpin with latest transformers fixes (#6763)
loadams
committed
182 days ago
Verified
f57b1ef1
Fix potential memory issues when use deepspeed Z3 (#6726)
wenbinc-Bin
committed
183 days ago
Verified
cd20a3bb