pytorch
3623cfb7 - [FSDP] Speed up first iter order check (part 2) (#96220)

Commit
1 year ago
[FSDP] Speed up first iter order check (part 2) (#96220)

For a tensor on GPU, moving it once to CPU and operating on it there is faster than moving it element by element from GPU to CPU. This follow-up applies the same change to `world_indices`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96220
Approved by: https://github.com/zhaojuanmao
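
The transfer pattern described in the commit message can be sketched as follows. This is a minimal illustration, not the actual FSDP code: the helper names `indices_elementwise` and `indices_bulk` are hypothetical, and the `world_indices` tensor here is only a small stand-in for the one the commit refers to. The sketch falls back to CPU when no GPU is available, in which case both paths behave the same and only the structure of the two approaches is shown.

```python
import torch


def indices_elementwise(t: torch.Tensor) -> list:
    # Hypothetical "slow" pattern: on a CUDA tensor, every .item() call
    # triggers its own device-to-host copy and synchronization.
    return [int(t[i].item()) for i in range(t.numel())]


def indices_bulk(t: torch.Tensor) -> list:
    # Hypothetical "fast" pattern: copy the whole tensor to CPU once,
    # then convert it to a Python list on the host.
    return t.cpu().tolist()


# Assumption: use the GPU if present, otherwise run the same code on CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the `world_indices` tensor mentioned in the commit message.
world_indices = torch.arange(8, dtype=torch.int32, device=device)

slow = indices_elementwise(world_indices)
fast = indices_bulk(world_indices)
```

Both helpers return the same values; the difference is the number of device-to-host transfers, which is one per element in the first and one in total in the second.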