SemanticDiff

pytorch
65f54bc0 - [SR] Optimize VarStack (#68750)

Commit

2 years ago

[SR] Optimize VarStack (#68750)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68750

There was some room for optimization in static runtime's `prim::VarStack`:

* Avoid refcount bumps - constructing the `std::vector<at::Tensor>` can be avoided by writing a custom version of `stack_out` that takes a `std::vector<at::Tensor*>`
* Skip the memory overlap check
* Avoid device dispatcher overhead in a few places (e.g. `tensor.unsqueeze -> at::native::unsqueeze`)

Test Plan: `buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest -- Stack`

Reviewed By: swolchok

Differential Revision: D32596934

fbshipit-source-id: e8f0ccea37c48924cb4fccbfdac4e1e11da95ee0
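
The first bullet in the summary (avoiding refcount bumps by passing `std::vector<at::Tensor*>` instead of `std::vector<at::Tensor>`) can be illustrated with a rough, self-contained sketch. This is not the code from the commit and does not use libtorch: `FakeTensor`, `make_tensor`, `refcount_seen_by_value`, `refcount_seen_by_pointer`, and `stack_flat` are hypothetical names, and a `std::shared_ptr` stands in for the reference-counted tensor handle. It only assumes that copying a tensor handle increments a reference count, while storing a pointer to the handle does not.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a refcounted tensor handle (NOT at::Tensor).
// Copying the handle bumps the shared_ptr reference count.
struct FakeTensor {
    std::shared_ptr<std::vector<float>> storage;
    long use_count() const { return storage.use_count(); }
};

inline FakeTensor make_tensor(std::vector<float> values) {
    return FakeTensor{std::make_shared<std::vector<float>>(std::move(values))};
}

// Gathering inputs by value: each push_back copies a handle,
// so the refcount of the first input rises while the vector is alive.
inline long refcount_seen_by_value(const FakeTensor& a, const FakeTensor& b) {
    std::vector<FakeTensor> inputs;
    inputs.push_back(a);
    inputs.push_back(b);
    return inputs[0].use_count();
}

// Gathering inputs by pointer: no handle is copied,
// so the refcount of the first input is unchanged.
inline long refcount_seen_by_pointer(const FakeTensor& a, const FakeTensor& b) {
    std::vector<const FakeTensor*> inputs;
    inputs.push_back(&a);
    inputs.push_back(&b);
    return inputs[0]->use_count();
}

// Toy "stack" over pointer inputs: concatenates the elements of each input
// in order. The real operator also handles shapes and dims; this does not.
inline std::vector<float> stack_flat(const std::vector<const FakeTensor*>& inputs) {
    std::vector<float> out;
    for (const FakeTensor* t : inputs) {
        assert(t != nullptr && t->storage != nullptr);
        out.insert(out.end(), t->storage->begin(), t->storage->end());
    }
    return out;
}
```

With two freshly created inputs, the by-value helper observes a count of 2 on the first input and the by-pointer helper observes 1. The trade-off of the pointer form is that the caller must keep the pointed-to handles alive for the duration of the call.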

References

#70343 - Merge master into lazy_tensor_staging

Author

Mike Iovine

Committer

facebook-github-bot

Parents
