Support DeepSpeed offload and reload states with ZeRO1 and ZeRO2 #7421
LYMDLUT
changed the title from "Try to support deepspeed offload states with ZeRO2" to "Try to support deepspeed offload states with ZeRO1 and ZeRO2" 304 days ago
LYMDLUT
changed the title from "Try to support deepspeed offload states with ZeRO1 and ZeRO2" to "Support deepspeed offload and reload states with ZeRO1 and ZeRO2" 272 days ago
LYMDLUT
changed the title from "Support deepspeed offload and reload states with ZeRO1 and ZeRO2" to "Support DeepSpeed offload and reload states with ZeRO1 and ZeRO2." 272 days ago
LYMDLUT
changed the title from "Support DeepSpeed offload and reload states with ZeRO1 and ZeRO2." to "Support DeepSpeed offload and reload states with ZeRO1 and ZeRO2" 272 days ago
Update stage_1_and_2.py
c723c328
Update stage_1_and_2.py
b808560d
Update engine.py
2e2e0583
Update offload_states.py
1740fe95
Align missing argument in AllReduceCoalescedHandle (#7414)
6f6bba44
Improvements to Communication Logger (#7404)
db550fc4
Add files via upload
1353e684
Update test_offload_states_zero2.py
f0f1ed7f
Update test_offload_states_zero2.py
c94fe1c6
Update stage_1_and_2.py
b1ae7f6e
Update test_offload_states_zero2.py
17c7f979
Update stage_1_and_2.py
5c4cfcf7
trying to fix nv-accelerate-v100.yml CI job (#7424)
cfb37836
Update stage_1_and_2.py
e9970a42
Update stage_1_and_2.py
b0ee1b43
fix: Propagate `strip_tensor_paddings` (#7426)
d2c19ed6
Use past_key_value when provided (#7428)
b7f98fe0
set `device_id` in torch's `init_process_group` (#7266)
2b9ac51d
[Ulysses-ALST] add FA3 support (#7430)
bba4756c
TiledMLP + SequenceTiledCompute: improve the bs>1 use-case (#7422)
7bcca9ac
Update test_offload_states_zero2.py
267281aa
Update test_offload_states_zero2.py
bb6769e1
Update test_offload_states_zero2.py
0348fe9d
Update stage_1_and_2.py
f69409e4
Update stage_1_and_2.py
61c84f19
Update stage_1_and_2.py
fcf950f8
Update stage_1_and_2.py
5215a444
Update stage_1_and_2.py
7137e7b9
Update stage_1_and_2.py
fb2c3699
Update test_offload_states_zero2.py
277e6261
Update test_offload_states.py
268a7096
Update test_offload_states.py
23ed5b8b
Update stage_1_and_2.py
4e5f24f5
Update stage_1_and_2.py
8053b8e0
Update stage_1_and_2.py
1d6327d0
Update stage_1_and_2.py
5efd58e0
Update test_offload_states_zero2.py
b24de28b
Remove tests from README that are already removed. (#7441)
abcf2186
[ALST] fix typo in the url (#7444)
dbc4b7dd
[ALST] fix typo in the url part2 (#7446)
c605f546
Remove additional unused tests (human-eval) (#7445)
85d5efd1
Fix: Adapt Llama injection policy for newer transformers versions (#7…
3d747ef5
Update version.txt after 0.17.3 release. (#7455)
d3a477e9
Fix: UnboundLocalError for variable 'dim' about issue (#7449)
f13d098c
adding TiledFusedLogitsLoss (#7437)
26551631
`TiledFusedLogitsLoss` bug fix (#7459)
fc9efa0f
Update version.txt after v0.17.4 release
947bdd72
Revert "Update version.txt after v0.17.4 release"
3a11e34e
Update version.txt after v0.17.4 release (#7460)
0e5e1604
Update README.md (#7465)
243f48eb
Add getter APIs for TP/PP/DP ranks in DeepSpeedEngine (#7427)
27b24f06
fix issues raised by Coverity scans (#7431)
4f9a9a04
Fix all-gather duplicate params and wrong dtype (#7462)
2255f5fd
fix #7188 (#7371)
984386ce
add --bind_cores_to_rank to zero offload tutorial (#7474)
8516f9fc
Add blog for ZenFlow (#7463)
376c5b7a
Fix cpu CI (#7481)
36b925ab
fix `deepspeed --venv_script` (#7469)
732ed3c4
Modal CI (#7289)
61681ce2
[UlyssesSPDataLoaderAdapter] fix iterator reset (#7472)
c756078b
[TiledFusedLogitsLoss] support inference (#7477)
cc5261d0
Update test_offload_states_zero2.py
fde1035f
Update test_offload_states_zero2.py
d16aa8a8
Update stage_1_and_2.py
f5f7d494
Fix pre-compile on cpu-only machines (#7168)
8a0d2262
Enable forked PRs (#7486)
4ffb4426
fix xpu device_id AttributeError issue (#7488)
d0db8f80
Add Zenflow code for Stage 1 & 2 (#7391)
672f326a
Fix invalid f-strings (#7457)
91bb16bb
Fix DeepCompile for PyTorch v2.8 (#7496)
5777e6cb
Reduce performance impact of compiler.enable decorator (#7498)
0a6ff078
Add index to HPU devices (#7497)
2769d2a3
Delete tests/unit/runtime/zero/test_offload_states_zero2.py
6ef88b9b
LYMDLUT
force-pushed from b9385872 to 6ef88b9b 271 days ago
Merge branch 'master' into master
f2dcffb0
Format fixes
fbefa4ab
Merge branch 'master' into master
829b742b
loadams
enabled auto-merge (squash) 270 days ago
loadams
merged bc8c0db3 into master 270 days ago