pytorch
d3ffe9ab - [PyTorch] Allocate correctly-sized output tensor in addmm_cuda (#56033)

Commit
3 years ago
[PyTorch] Allocate correctly-sized output tensor in addmm_cuda (#56033)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56033

There doesn't seem to be any reason not to size the output correctly, and it avoids a round of dispatch for resize.

ghstack-source-id: 127409715

Test Plan:
Inspected GPU trace for simple nn.Linear in a loop. No more resize operator invocation.

Existing CI should let us know if this is incorrect

Reviewed By: ngimel

Differential Revision: D27768311

fbshipit-source-id: fb48ec50f3cffc1015ef03d528e9007274b4dd3a
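
The idea in the summary can be illustrated with a small toy model. This is a sketch in plain Python, not the actual addmm_cuda code: the function names (`addmm_output_shape`, `run_resize_later`, `run_sized_up_front`) and the operator labels in the `ops` lists are hypothetical, chosen only to show why allocating the output at its final size removes one operator call. It assumes the usual matrix-product shape rule, where an (n, k) matrix times a (k, m) matrix gives an (n, m) result.

```python
def addmm_output_shape(mat1_shape, mat2_shape):
    """Shape of mat1 @ mat2 for 2-D operands: (n, k) x (k, m) -> (n, m)."""
    n, k = mat1_shape
    k2, m = mat2_shape
    if k != k2:
        raise ValueError(f"inner dimensions differ: {k} vs {k2}")
    return (n, m)


def run_resize_later(mat1_shape, mat2_shape):
    """Toy model: allocate an empty output first, then resize it.

    Returns the final output shape and the list of (hypothetical)
    operator calls that were needed to get there.
    """
    ops = ["empty"]          # output allocated with a placeholder shape
    out_shape = (0,)
    wanted = addmm_output_shape(mat1_shape, mat2_shape)
    if out_shape != wanted:
        ops.append("resize_")  # the extra call the commit aims to avoid
        out_shape = wanted
    ops.append("gemm")
    return out_shape, ops


def run_sized_up_front(mat1_shape, mat2_shape):
    """Toy model: compute the shape first, allocate once, no resize."""
    wanted = addmm_output_shape(mat1_shape, mat2_shape)
    ops = ["empty", "gemm"]  # output allocated at its final size
    return wanted, ops


if __name__ == "__main__":
    for run in (run_resize_later, run_sized_up_front):
        shape, ops = run((8, 16), (16, 4))
        print(run.__name__, shape, ops)
```

Both variants end with the same output shape; only the toy call list differs, which mirrors the test plan's observation that the resize operator no longer appears in the trace.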