xla
Experimental TPU implementation of DistributedDataParallel
#4193
Merged

will-cromar merged 10 commits into master from wcromar/pjrt-ddp
will-cromar force pushed from e9907a69 to a94a8158 3 years ago
will-cromar changed the title from "[WIP] Experimental TPU implementation of DistributedDataParallel" to "Experimental TPU implementation of DistributedDataParallel" 3 years ago
will-cromar marked this pull request as ready for review 3 years ago
will-cromar requested a review from alanwaketan 3 years ago
will-cromar requested a review from JackCaoG 3 years ago
JackCaoG commented on 2022-11-22
JackCaoG commented on 2022-11-22
will-cromar Add C++ API to get PJRT process ID (d9476865)
will-cromar Add PJRT-compatible DDP implementation (54e335dc)
will-cromar Update ImageNet test for PJRT+DDP (2c073034)
will-cromar Use new DDP implementation in tests (754b91d8)
will-cromar Fix tests (b7717b17)
will-cromar Fix XRT test (a1c7b201)
will-cromar formatting (e531eb62)
will-cromar Make process group init optional. (17a1cabf)
will-cromar formatting (162272ba)
will-cromar Check for TPU before checking TPU version (0b7fe908)
will-cromar force pushed from ed03d91a to 0b7fe908 3 years ago
JackCaoG approved these changes on 2022-11-28
alanwaketan commented on 2022-11-28
alanwaketan approved these changes on 2022-11-28
will-cromar added the runtime label
will-cromar added the ddp label
will-cromar merged 803052c1 into master 3 years ago
