pytorch
32eb3ab7 - [FSDP] Speed up first iter order check (#96146)

Commit
1 year ago
For a tensor on GPU, moving it to CPU once and operating on it there is faster than moving it element by element from GPU to CPU. The relevant tensor in this case is `world_num_valid_indices`.

This closes https://github.com/pytorch/pytorch/issues/95728.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96146
Approved by: https://github.com/zhaojuanmao, https://github.com/rohan-varma
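
The idea in the commit message can be illustrated with a minimal sketch. This is not the FSDP code itself: the helper names `read_elementwise` and `read_bulk` are hypothetical, and the tensor contents are made up for illustration. It assumes only that each `.item()` call on a GPU tensor performs its own device-to-host copy, while `.cpu()` copies the whole tensor once.

```python
from typing import List

import torch

# Fall back to CPU so the sketch still runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def read_elementwise(t: torch.Tensor) -> List[int]:
    # Slow pattern on GPU: every .item() call is a separate
    # device-to-host copy that also synchronizes with the device.
    return [int(t[i].item()) for i in range(t.numel())]


def read_bulk(t: torch.Tensor) -> List[int]:
    # Faster pattern: copy the whole tensor to CPU once, then read
    # the values from host memory.
    return t.cpu().tolist()


# Hypothetical stand-in for `world_num_valid_indices`: one count per rank.
world_num_valid_indices = torch.tensor([3, 3, 3, 3], device=device)

# Both patterns read the same values; only the number of transfers differs.
assert read_elementwise(world_num_valid_indices) == read_bulk(world_num_valid_indices)
```

On a CPU-only machine the two helpers behave the same and no speedup is expected; the difference should only show up when the tensor lives on a GPU.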