pytorch
daf1050a - [dtensor] refactor sharding cost model to count for latency (#119897)


220 days ago

[dtensor] refactor sharding cost model to count for latency (#119897)

This PR refactors the sharding cost model so that it estimates redistribute cost more accurately, accounting for both collective latency and communication time.

The previous cost model did not rescale the latency and communication time, so the latency factor was too small to have any effect. As a result, for small tensors, multiple collectives were preferred over a single collective, which is wrong.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119897
Approved by: https://github.com/tianyu-l
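The effect described in the commit message can be illustrated with a minimal sketch of a latency-plus-transfer-time cost model. This is not the DTensor implementation: the names `Link`, `collective_cost_us`, and `plan_cost_us`, as well as the latency and bandwidth figures, are hypothetical and chosen only to show why a negligible latency term makes a plan with more collectives look cheaper for small tensors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical interconnect parameters (illustrative values only)."""
    latency_us: float          # fixed startup cost per collective, in microseconds
    bandwidth_gb_per_s: float  # sustained bandwidth, in GB/s


def collective_cost_us(num_bytes: int, link: Link) -> float:
    """Cost of one collective: fixed latency plus transfer time, both in microseconds."""
    transfer_us = num_bytes / (link.bandwidth_gb_per_s * 1e9) * 1e6
    return link.latency_us + transfer_us


def plan_cost_us(step_bytes: list[int], link: Link) -> float:
    """Cost of a redistribute plan given the bytes moved by each collective in it."""
    return sum(collective_cost_us(b, link) for b in step_bytes)


# Assumed figures: 10 us startup latency, 100 GB/s bandwidth.
link = Link(latency_us=10.0, bandwidth_gb_per_s=100.0)
# Same link with the latency term effectively dropped, mimicking a model
# in which latency is too small to be counted.
no_latency = Link(latency_us=0.0, bandwidth_gb_per_s=100.0)

small = 4096  # a small tensor, in bytes

# Plan A: one collective over the whole tensor.
# Plan B: two collectives that together move fewer bytes.
plan_a = [small]
plan_b = [small // 4, small // 4]

for name, l in (("with latency", link), ("latency ignored", no_latency)):
    a, b = plan_cost_us(plan_a, l), plan_cost_us(plan_b, l)
    better = "single collective" if a < b else "multiple collectives"
    print(f"{name}: A={a:.4f} us, B={b:.4f} us, prefers {better}")
```

With the latency term counted in the same unit as transfer time, the single collective wins for the small tensor; once latency is ignored, the plan with two collectives looks cheaper purely because it moves fewer bytes.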

Author

wanchaol

Committer

pytorchmergebot
