FSDP orchestration: apply + loading/saving #46990
Add FSDP orchestration: mesh init, distribute-before-load, and DCP save.
7c113940
Merge branch 'split/a-pr-3-dual-path-loading' into split/a-pr-4-fsdp-…
17c6d402
add fsdp plan to 2 models for now
00eb1166
add tests fsdp mixin
be296ddf
linting
05900d60
refactor test fsdp mixin
fc2423bd
test fsdp mixin cleaning
5bbd8201
remove fsdp policy in tests + trim down further
b6d0b67a
test fsdp clean
ea361231
restore test_modeling_utils
bec4d23a
linting
8d3d3298
start trim down stuff
6316ee1e
fix
6e9004ec
3outeille
marked this pull request as draft 1 day ago
breaking: cleaning modeling_utils.py
68df491a
load path with fsdp (dtensor) and tp (old tp) is linked
16b0b291
linting
e976a44c
add saving
a2fb155f
styling
5f52f196
fix tp ci
7f543017
add fsdp to ci
99f79acc
linting
b4514906
pick one model only for this PR
54ff4d1d
restore
33995399
trigger fsdp ci
11cf79c2
doc cleaning + tp_size remove
5b7ac3e5
fix tp ci for ep
06b0c394
edit doc
7a94d77a
3outeille
changed the title from "FSDP orchestration: mesh init, distribute-before-load, DCP save" to "FSDP orchestration: apply + loading/saving" 2 hours ago
move distributed function to utils + guarding
f3c742b7
linting
37df13ee
3outeille
marked this pull request as ready for review 1 hour ago
Merge branch 'split/a-pr-3-dual-path-loading' into split/a-pr-4-fsdp-…
86875d22
Merge branch 'split/a-pr-3-dual-path-loading' into split/a-pr-4-fsdp-…
450579ba
Assignees
No one assigned