[Gradient Compression] Fix warm-start for PowerSGD layerwise compression (#50283)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50283
For layerwise compression, the previous warm-start implementation only skipped the memory allocations for Qs, but did not skip refilling them with random values on every iteration, which defeated the purpose of warm-start. The fix is illustrated by the sketch below.
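A minimal sketch of the intended behavior, assuming a simplified state object; the names (`_DemoPowerSGDState`, `get_q`, `q_memory`) are illustrative stand-ins, not the actual hook's API:

```python
import torch

class _DemoPowerSGDState:
    # Hypothetical, minimal stand-in for the comm hook state; only the
    # fields needed to illustrate the warm-start logic.
    def __init__(self, warm_start=True):
        self.warm_start = warm_start
        self.q_memory = {}  # bucket index -> cached Q tensor

def get_q(state, bucket_index, n, rank):
    # Before the fix: Q was re-filled with random values on every
    # iteration, even when warm-start reused the allocated buffer.
    # After the fix: with warm-start on, Q keeps its values from the
    # previous iteration, so only the very first call randomizes it.
    if bucket_index not in state.q_memory:
        # First iteration: allocate and randomize Q.
        state.q_memory[bucket_index] = torch.randn(n, rank)
    elif not state.warm_start:
        # Warm-start disabled: re-randomize in place each iteration.
        state.q_memory[bucket_index].normal_()
    return state.q_memory[bucket_index]
```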
Also fix the unit test in distributed_test.py: previously the process group was not created correctly, so no communication actually occurred in test_DistributedDataParallel_powerSGD_ddp_comm_hook (see the sketch below).
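A hedged sketch of the kind of setup the test needed; the helper name and the gloo backend/port values are assumptions for illustration, not the test's actual code:

```python
import os
import torch.distributed as dist

def _setup_process_group(rank, world_size):
    # Without an explicitly initialized process group, a DDP comm hook
    # test can silently run with no real communication between ranks.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
```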
Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202
ghstack-source-id: 120101220
Test Plan:
Verified the fix by adding some logging locally.
Also verified no NE diff on Ads 1x.
Reviewed By: rohan-varma
Differential Revision: D25846222
fbshipit-source-id: 1ebeeb55ceba64d4d904ea6ac1bb42b1b2241520