[Gradient Compression] Warm-start of PowerSGD (#49451)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49451
Reuse the low-rank tensors P(s) and Q(s) from the previous iteration whenever possible. Warm-starting the power iteration this way can improve compression in both accuracy (the iteration starts closer to the top singular subspace) and speed (no need to re-initialize the tensors each step).
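The warm-start idea can be sketched as a single rank-r power-iteration step that carries Q across calls. This is an illustrative toy, not the actual hook implementation; the function name `powersgd_step` and its signature are hypothetical.

```python
import torch

def powersgd_step(M, Q, warm_start=True):
    """One PowerSGD-style compression step on matrix M (n x m) with Q (m x r).

    Hypothetical helper for illustration only. With warm_start=True, the Q
    from the previous iteration is reused, so the power iteration resumes
    near the subspace it already found; otherwise Q is re-randomized.
    """
    if not warm_start:
        Q = torch.randn_like(Q)
    P = M @ Q                      # n x r
    P, _ = torch.linalg.qr(P)     # orthogonalize P
    Q = M.t() @ P                  # m x r, refined Q carried to the next step
    return P @ Q.t(), Q            # rank-r approximation of M, warmed Q
```

Calling this repeatedly on the same (or slowly changing) gradient matrix while reusing the returned Q tends to shrink the approximation error, which is the effect the warm start exploits across training iterations.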
Also add a unit test for batched PowerSGD to test_c10d.py.
Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202
ghstack-source-id: 119014132
Test Plan:
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_powerSGD_ddp_comm_hook_nccl
buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_DistributedDataParallel_powerSGD_ddp_comm_hook
Reviewed By: rohan-varma
Differential Revision: D25583086
fbshipit-source-id: a757df3c4cfcc0ead4647f7de2f43198f1e063ee