pytorch-lightning
Fix ModelParallelStrategy fails with non-distributed checkpoint.
#21384
Open

littlebullGit
github-actions added pl
littlebullGit marked this pull request as ready for review 22 days ago
littlebullGit requested a review from lantiga 22 days ago
littlebullGit requested a review from tchaton 22 days ago
littlebullGit requested a review from justusschock 22 days ago
littlebullGit requested a review from ethanwharris 22 days ago
littlebullGit force pushed from 82f9a7d8 to db3d718a 22 days ago
littlebullGit Add regression test for ModelParallel single-file checkpoint
d4e476fa
littlebullGit force pushed from db3d718a to d4e476fa 22 days ago
littlebullGit changed the title from "Add regression test for ModelParallel single-file checkpoint" to "CUDA test to reproduce: ModelParallelStrategy fails with non-distributed checkpoint. #21357" 22 days ago
bhimrazy marked this pull request as draft 21 days ago
littlebullGit changed the title from "CUDA test to reproduce: ModelParallelStrategy fails with non-distributed checkpoint. #21357" to "Fix ModelParallelStrategy fails with non-distributed checkpoint. #21384" 21 days ago
littlebullGit marked this pull request as ready for review 21 days ago
littlebullGit changed the title from "Fix ModelParallelStrategy fails with non-distributed checkpoint. #21384" to "Fix ModelParallelStrategy fails with non-distributed checkpoint." 21 days ago
littlebullGit force pushed from 42a99172 to 31b09760 21 days ago
littlebullGit force pushed from 31b09760 to 646e01bc 21 days ago
littlebullGit Fix ModelParallel single-file checkpoint with compiled modules
cd663c61
littlebullGit force pushed from 90af5cbe to cd663c61 20 days ago
github-actions added has conflicts
littlebullGit Merge branch 'master' into fix/21357-modelparallel-checkpoint
2e5ae5f4
github-actions removed has conflicts
SkafteNicki added distributed
SkafteNicki added torch.compile
SkafteNicki approved these changes on 2025-12-01
github-actions added has conflicts
littlebullGit Merge branch 'master' into fix/21357-modelparallel-checkpoint
5eca13e4
github-actions removed has conflicts
littlebullGit Merge branch 'master' into fix/21357-modelparallel-checkpoint
ffcd6e41
github-actions added has conflicts
littlebullGit Merge branch 'master' into fix/21357-modelparallel-checkpoint
501f0213
github-actions removed has conflicts
littlebullGit Merge branch 'master' into fix/21357-modelparallel-checkpoint
457260af

Assignees
No one assigned