Fix nanmedian result using more CUDA memory than necessary (#68591)
Summary:
CUDA's `at::nanmedian` creates a sorted copy of the input, then indexes into it to create a single-element view. This view necessarily keeps the entire `sorted` tensor's storage alive, which can be avoided by returning a copy instead — this is what `at::median` already does indirectly via `at::where`.
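The lifetime problem is the same one a Python `memoryview` exhibits: a tiny view into a large buffer pins the whole buffer's memory, while a copy does not. A minimal stdlib sketch (illustrative analogy only, not the actual CUDA code):

```python
# A large buffer stands in for the sorted temporary tensor.
big = bytearray(10**6)
# A one-byte view stands in for the single-element "median" view.
view = memoryview(big)[500_000:500_001]

# While the view exists, the underlying buffer cannot be released:
pinned = False
try:
    big.clear()          # resizing is blocked by the exported view
except BufferError:
    pinned = True

copy = bytes(view)       # an independent one-byte copy
view.release()           # once the view is gone...
big.clear()              # ...the large buffer can be freed
```

Returning the copy rather than the view lets the large sorted temporary be deallocated as soon as the kernel returns.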
This also changes the index variable `k` to a plain `int64_t` instead of the CUDA tensor used before. That avoids the extra host and device operations incurred by calling `Tensor`'s `operator-`, which helps offset the cost of the `clone` added here.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68591
Reviewed By: dagitses
Differential Revision: D32538538
Pulled By: ngimel
fbshipit-source-id: abe9888f80cf9d24d50a83da756e649af1f6ea3b