Allow drop_last option in DistributedSampler (#41171)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41171
DistributedSampler allows data to be split evenly across workers in
DDP, but it has always added extra (repeated) samples so that the data divides
evenly when the number of samples is not divisible by the number of workers.
This can cause issues, for example when computing validation accuracy in a
distributed fashion, where some samples would be counted twice.
This PR adds a drop_last option that drops the tail of the data so that the
effective dataset size is evenly divisible by the number of workers. This
ensures that DDP can train without issue (there are no uneven inputs) and that
each replica gets an equal number of data indices.
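A minimal sketch contrasting the default padding behavior with the new drop_last option (the toy dataset, the replica count of 3, and shuffle=False are illustrative choices for this example, not part of the PR):

```python
from torch.utils.data import Dataset
from torch.utils.data.distributed import DistributedSampler

class ToyDataset(Dataset):
    """10 samples split across 3 replicas: 10 is not evenly divisible by 3."""
    def __len__(self):
        return 10
    def __getitem__(self, idx):
        return idx

dataset = ToyDataset()

for rank in range(3):
    # Default: pad with repeated indices so every replica sees ceil(10/3) = 4
    # samples; two sample indices appear twice across the epoch.
    padded = DistributedSampler(dataset, num_replicas=3, rank=rank,
                                shuffle=False)
    # drop_last=True: drop the tail so every replica sees floor(10/3) = 3
    # samples, and no sample index is repeated.
    dropped = DistributedSampler(dataset, num_replicas=3, rank=rank,
                                 shuffle=False, drop_last=True)
    print(f"rank {rank}: padded={list(padded)} dropped={list(dropped)}")

# With shuffle=False the assignment is deterministic:
# rank 0: padded=[0, 3, 6, 9] dropped=[0, 3, 6]
# rank 1: padded=[1, 4, 7, 0] dropped=[1, 4, 7]
# rank 2: padded=[2, 5, 8, 1] dropped=[2, 5, 8]
```

Note that with the default padding, indices 0 and 1 each appear on two replicas, which is exactly the double-counting problem described above; with drop_last=True, index 9 is dropped instead.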
ghstack-source-id: 108617516
Test Plan: Added unit test
Reviewed By: mrshenli
Differential Revision: D22449974
fbshipit-source-id: e3156b751f5262cc66437b9191818b78aee8ddea