Never reuse accumulated gradients' buffers (#119334)
Since AccumulateGrad may steal the gradient's `c10::Storage`, we can't reuse the gradient's buffer for another op; otherwise the accumulated gradient will get overwritten. From benchmarks, allocating a new buffer with inductor's codegen'd `_empty_strided_cpu/cuda` and assigning into it has lower overhead than deep copying the gradient and reusing its buffer.
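A minimal sketch of the hazard in plain eager-mode Python, not the actual AccumulateGrad or inductor-generated wrapper code (the `param`, `grad_buf`, and `scratch` names are illustrative, and `torch.empty_strided` stands in for the codegen'd `_empty_strided_cpu/cuda` helpers): once `param.grad` aliases the gradient buffer's storage, reusing that buffer for a later op clobbers the accumulated gradient, whereas writing into a freshly allocated buffer leaves it intact.

```python
import torch

# Illustrative sketch only; names and allocation call are stand-ins.
param = torch.zeros(4, requires_grad=True)
grad_buf = torch.ones(4)   # gradient buffer produced by the compiled backward
param.grad = grad_buf      # AccumulateGrad's fast path can keep this storage
                           # by aliasing it rather than copying it

# Hazard: reusing grad_buf as the output of a later op overwrites param.grad.
grad_buf.fill_(42.0)
assert param.grad[0].item() == 42.0   # accumulated gradient is corrupted

# Fix pattern (sketched with torch.empty_strided): give the later op a fresh
# buffer instead of reusing the gradient's buffer.
param.grad = torch.ones(4)            # reset for the sketch
scratch = torch.empty_strided(param.grad.size(), param.grad.stride())
scratch.fill_(42.0)
assert param.grad[0].item() == 1.0    # accumulated gradient is untouched
```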
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119334
Approved by: https://github.com/jansel
ghstack dependencies: #118817