pytorch · commit 905cd673 · 3 years ago
[DDP Comm Hook] Re-enable the optimization of fusing copy and division when no comm hook is specified (#61379)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61379

The optimization was accidentally removed in https://github.com/pytorch/pytorch/pull/59574. It saves a scan over all the input parameters by fusing the copy and division operations. The default temporary hook is now allreduce by sum, and no extra division is done inside the hook.

ghstack-source-id: 133288529

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_accumulate_gradients_no_sync
buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_ddp_grad_div_uneven_inputs
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_fp16
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_fp16_grad_is_view
buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_DistributedDataParallel_non_default_stream
buck test mode/dev-nosan caffe2/test/distributed:c10d -- test_sparse_gradient
buck test mode/dev-nosan caffe2/test/distributed:c10 -- test_ddp_checkpointing_once
buck test mode/dev-nosan caffe2/test/distributed:c10 -- test_ddp_checkpointing_twice

Reviewed By: rohan-varma

Differential Revision: D29597614

fbshipit-source-id: 2434e4fd4e6abad7871cfe47886fe97b6e4ba28f
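For context, the sketch below (not part of this commit) contrasts the two paths the change touches: with no comm hook registered, DDP's reducer fuses the per-parameter gradient copy into the flat bucket with the division by world size in a single pass, while a user-registered hook must perform the division itself. The hook body and model setup are illustrative assumptions; register_comm_hook, GradBucket.buffer(), and the future-based return value follow the public DDP comm-hook API in recent PyTorch releases.

    # Minimal sketch (illustrative, not from this commit): a user-defined DDP comm
    # hook that averages gradients itself, versus relying on DDP's default path.
    # Assumes torch.distributed is already initialized (e.g. via torchrun) and
    # one GPU per rank; the model here is a stand-in.
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def allreduce_mean_hook(state, bucket):
        # A registered hook owns the division: DDP skips its internal division
        # (the one fused with the grad copy) whenever any comm hook is set.
        grads = bucket.buffer() / dist.get_world_size()
        work = dist.all_reduce(grads, op=dist.ReduceOp.SUM, async_op=True)
        return work.get_future().then(lambda fut: fut.value()[0])

    rank = dist.get_rank()
    model = DDP(torch.nn.Linear(8, 8).to(rank), device_ids=[rank])

    # Default path (no hook registered): the reducer copies each grad into the
    # flat bucket and divides by world size in the same pass -- the fused
    # copy+div optimization this commit restores -- then allreduces by sum.

    # Custom-hook path: uncomment to take over communication; the division in
    # the hook replaces the reducer's fused division.
    # model.register_comm_hook(state=None, hook=allreduce_mean_hook)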
Author: Yi Wang