pytorch
68816e4f - Remove inplace buffers when original and mutation are both removed (#102289)

Commit

1 year ago

Remove inplace buffers when original and mutation are both removed (#102289)

Currently, if an inplaced buffer is completely internal to a fused kernel and thus doesn't need to be allocated, we still allocate it and pass an unused argument to the kernel, because our buffer-removal analysis treats inplaced buffers separately (assuming that either the original or the mutated value is still needed). This PR extends buffer removal to inplaced buffers that can be removed.

The generated kernel for e.g. ln changes from

```
def triton_(in_out_ptr0, in_out_ptr1, in_ptr0, in_ptr1, in_ptr2, out_ptr0, out_ptr1, xnumel, rnumel, XBLOCK : tl.constexpr):
```

where `in_out_ptr0` is unused in the kernel, to

```
def triton_(in_out_ptr1, in_ptr0, in_ptr1, in_ptr2, out_ptr0, out_ptr1, xnumel, rnumel, XBLOCK : tl.constexpr):
```

and the corresponding allocation/reuse lines in the wrapper are removed.

`in_out_ptr1` is also mislabeled: it's not `in_out`, it is only written to, but this PR doesn't fix that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102289
Approved by: https://github.com/jansel
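The rule described above can be illustrated with a toy model. This is only a sketch of the idea, not the code from the PR: `InplaceBuffer`, `live_kernel_args`, and the buffer names `buf0`, `buf1`, `buf3`, `buf4` are hypothetical and do not correspond to actual Inductor classes or functions. The assumption is that an in/out argument can be dropped only when every buffer name sharing its storage (the original and its mutation) has been removed.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class InplaceBuffer:
    """Hypothetical record: one in/out kernel argument and the buffers aliasing it."""
    arg_name: str              # e.g. "in_out_ptr0"
    names: Tuple[str, ...]     # original buffer plus its in-place mutation(s)


def live_kernel_args(
    inplace_buffers: Iterable[InplaceBuffer],
    other_args: Iterable[str],
    removed_buffers: Set[str],
) -> List[str]:
    """Return the kernel argument list, skipping fully removed in-place buffers."""
    args: List[str] = []
    for buf in inplace_buffers:
        # Drop the argument only if the original AND every mutation are removed.
        if all(name in removed_buffers for name in buf.names):
            continue
        args.append(buf.arg_name)
    args.extend(other_args)
    return args


# Illustrative data loosely modeled on the kernel signature in the commit message.
inplace = [
    InplaceBuffer("in_out_ptr0", ("buf0", "buf3")),
    InplaceBuffer("in_out_ptr1", ("buf1", "buf4")),
]
others = ["in_ptr0", "in_ptr1", "in_ptr2", "out_ptr0", "out_ptr1", "xnumel", "rnumel"]

# Both buf0 and buf3 are internal to the fused kernel, so in_out_ptr0 disappears.
after = live_kernel_args(inplace, others, removed_buffers={"buf0", "buf3"})
# Only the original is removed, so the argument has to stay.
partial = live_kernel_args(inplace, others, removed_buffers={"buf0"})
```

In this model `after` starts with `in_out_ptr1`, matching the shape of the second signature above, while `partial` still carries `in_out_ptr0` because the mutated value is assumed to be needed.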

Author

Natalia Gimelshein

Committer

pytorchmergebot

Parents

0db704d2
