Enable tensor fragments for zero 2 & 3 #2727
Enable tensor fragments for zero 2
bd0f7da9
Merge branch 'master' into olruwase/tensor_fragments
830570e8
stas00 commented on 2023-01-23
Update deepspeed/utils/tensor_fragment.py
6acbd537
stas00 commented on 2023-01-23
Update deepspeed/utils/tensor_fragment.py
ea4f2098
Merge branch 'master' into olruwase/tensor_fragments
4b5373cc
Merge branch 'master' into olruwase/tensor_fragments
026ff96d
Support offload
9bd542c1
Support offload
8a240260
Merge branch 'master' into olruwase/tensor_fragments
b4b0f9f4
Support multi-gpu
864b4c95
Cleanup
43a0aed0
WIP
c2730cc1
stas00 commented on 2023-01-26
Merge branch 'master' into olruwase/tensor_fragments
923b2249
Update deepspeed/runtime/zero/stage3.py
d1492458
tjruwase changed the title from "Enable tensor fragments for zero 2" to "Enable tensor fragments for zero 2 & 3" 2 years ago
Master rebase
5837d162
Merge branch 'master' into olruwase/tensor_fragments
187d166b
Support padding
0f636c8b
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
5cfc6cc1
stas00 commented on 2023-01-26
Update deepspeed/runtime/zero/stage3.py
458bf504
z3 optimizer state support; aligned api
4738aacd
Merge branch 'master' into olruwase/tensor_fragments
730c4590
Support frozen z3 params
0f81194d
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
d15a7862
Merge branch 'master' into olruwase/tensor_fragments
ad10c372
Unit tests
59dad7ce
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
b903e631
Merge branch 'master' into olruwase/tensor_fragments
8034117d
Check NVMe offload capability
448fa96b
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
0c6ccdc5
Formatting
f02e3be4
Merge branch 'master' into olruwase/tensor_fragments
6cced794
Docs
d2ca3e00
Merge branch 'master' into olruwase/tensor_fragments
522d5dd4
More docs
6c1217aa
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
686d18c0
More docs
53022b0f
Merge branch 'master' into olruwase/tensor_fragments
b06ec44d
stas00 commented on 2023-02-03
Update docs/code-docs/source/zero3.rst
6d632cb7
stas00 commented on 2023-02-03
More docs
d0c99612
Update docs/code-docs/source/zero3.rst
9a1812ad
More docs
a81a5435
stas00 commented on 2023-02-03
More docs
7303e2cd
stas00 commented on 2023-02-03
Update docs/code-docs/source/zero3.rst
2d5bdd6b
stas00 commented on 2023-02-04
Update deepspeed/utils/tensor_fragment.py
e3fc9979
Merge branch 'master' into olruwase/tensor_fragments
20984e42
More docs
03d9f3fc
Merge branch 'master' into olruwase/tensor_fragments
3d68746e
Support unsharded fp32 grad
cb009987
Merge branch 'master' into olruwase/tensor_fragments
e9aa4682
Merge branch 'master' into olruwase/tensor_fragments
d45782ac
Remove debug prints
7fa2010c
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
71238a4a
jeffra approved these changes on 2023-02-07
Merge branch 'master' into olruwase/tensor_fragments
dd55ec1b
Merge branch 'master' into olruwase/tensor_fragments
b107afc2
Fix off-by-one detection of empty grads
e45c5683
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
94f1716a
Merge branch 'master' into olruwase/tensor_fragments
e5d3b54b
stas00 commented on 2023-02-09
stas00 commented on 2023-02-09
stas00 commented on 2023-02-09
stas00 commented on 2023-02-09
stas00 commented on 2023-02-09
Update deepspeed/utils/tensor_fragment.py
e68caab5
Update deepspeed/utils/tensor_fragment.py
7e2cbca9
Update deepspeed/utils/tensor_fragment.py
bb4a8d06
Update deepspeed/runtime/zero/stage3.py
8acd7fb7
Merge branch 'master' into olruwase/tensor_fragments
b8d7b878
Merge branch 'master' into olruwase/tensor_fragments
5f96ddd7
Merge branch 'master' into olruwase/tensor_fragments
1303bdd5
Merge branch 'master' into olruwase/tensor_fragments
8a9c512d
Merge branch 'master' into olruwase/tensor_fragments
d3e138ed
Merge branch 'master' into olruwase/tensor_fragments
87797432
Merge branch 'master' into olruwase/tensor_fragments
4fb6b0c4
Merge branch 'master' into olruwase/tensor_fragments
0895e3c1
Merge branch 'master' into olruwase/tensor_fragments
137e325e
Merge branch 'master' into olruwase/tensor_fragments
97586dcf
Fix off-by-one error
4d1c9920
Merge branch 'master' into olruwase/tensor_fragments
1b6003bd
Skip ranks with no gradient data
0d3c8da9
Formatting
3d40f64a
Merge branch 'olruwase/tensor_fragments' of github.com:microsoft/Deep…
6ea17899
Merge branch 'master' into olruwase/tensor_fragments
1d9174c7
Merge branch 'master' into olruwase/tensor_fragments
f1efeca1
Merge branch 'master' into olruwase/tensor_fragments
a56da9e8
Merge branch 'master' into olruwase/tensor_fragments
9bfee830
Merge branch 'master' into olruwase/tensor_fragments
5b790733
Add license
4e87e1bb
Fix license
84450f3a
tjruwase merged commit 541e423a into master 2 years ago
mrwyattii
deleted the olruwase/tensor_fragments branch 2 years ago