microsoft/DeepSpeed
Pull Requests
Warn when zero.Init silently falls back to a single rank (#8084)
#8089 opened 2026-06-24 18:59 by akshansh47
fix: use local ev_values and wrap dict.values() in list()
#8087 opened 2026-06-23 04:35 by hashwnath
fix: add buffer-length check in shm.cpp
#8082 opened 2026-06-20 12:23 by orbisai0security
fix: sanitize subprocess call in ds_aio_job.py
#8081 opened 2026-06-20 06:40 by orbisai0security
ZeRO 1/2: wait on all IPG-bucket producer streams in average_tensor (#8061)
#8080 opened 2026-06-19 22:43 by arunshar
feat: add Trackio as a new experiment monitoring backend
#8065 opened 2026-06-15 14:41 by chanduripranav
Add AutoEP + AutoTP parallel folding
#8064 opened 2026-06-13 17:43 by tohtana
[DeepCompile] fix gather params in dynamo skipped frames for ZeRO3
#8059 opened 2026-06-11 14:32 by XAheli
feat(zenflow): run the overlapped CPU optimizer in a native process
#8058 opened 2026-06-10 21:34 by Antlera
Fix eigenvalue parsing for compression-only quantize configs
#8057 opened 2026-06-10 07:44 by sowndappan5
Add optional torchembed RoPE backend to apply_rotary_pos_emb
#8052 opened 2026-06-07 21:35 by py-ai-dev
Fix incorrect variable name
#8051 opened 2026-06-07 18:21 by Muneerali199
fix: log eigenvalue monitor values
#8049 opened 2026-06-05 08:29 by he-yufeng
fix: log block eigenvalue summary events
#8048 opened 2026-06-04 23:21 by he-yufeng
Fix minor comment/docstring typos in runtime and inference modules
#8046 opened 2026-06-03 08:21 by nathon-lee
zero3: defer param release during retain_graph backward #7352
#8045 opened 2026-06-03 06:55 by nathon-lee
[Draft] Add On-Policy Distillation (OPSD) Trainer in DeepSpeed
#8027 opened 2026-05-26 07:31 by PKUWZP
Add Qwen 3.5 preset to AutoTP
#7978 opened 2026-04-16 12:51 by tohtana
Refactor/torch autocast encapsulate global state
#7946 opened 2026-04-02 06:06 by nathon-lee
Fix ZeRO-3 optimizer initialization validation (#7844)
#7929 opened 2026-03-28 16:20 by amadhan882
Add torch_xla TPU support for ZeRO-1/2
#7917 opened 2026-03-21 18:43 by PKUWZP
doc: Remove suggestion to build extensions in parallel
#7899 opened 2026-03-12 15:58 by Flamefire
Fix Stage 0 + Ulysses crash: make bwc_tensor_model_parallel_rank() resilient to MP API absence
#7888 opened 2026-03-06 06:59 by nathon-lee
fix(zero): Ensure full gradient reduction for Muon optimizer with reduce_scatter
#7878 opened 2026-02-27 06:46 by nathon-lee
fix: correct DistributedAttention output shape and pad uneven sequence lengths (#7842)
#7868 opened 2026-02-22 11:00 by harshang03
fix: keep fp32-pinned parameters out of the bf16 cast path in ZeRO-3 (#7747)
#7867 opened 2026-02-22 10:52 by harshang03
Revert "fix: remove premature MPI environment variable check in OpenMPIRunner"
#7864 opened 2026-02-21 01:39 by mikloorbi-sys
Fix global .cuh ignore and enforce tracked CUDA headers
#7858 opened 2026-02-18 04:38 by harshang03
Fix ZeRO legacy grad-hook crash when next_functions is missing
#7857 opened 2026-02-17 22:07 by harshang03
Reject non-finite fp16 loss_scale across config and ZeRO paths
#7856 opened 2026-02-17 18:13 by harshang03