pytorch
2e8e560c - Fix anomaly mode memory leak (#51610)

Summary: Fixes https://github.com/pytorch/pytorch/issues/51349

The memory leak happens when 1) `create_graph` is True AND 2) anomaly detection mode is on.

When a backward node's constructor is called during the backward pass, the currently evaluating node is assigned as a "parent" of the newly created node. The code that assigns the parent runs into the following issue: `functionToPyObject(parent_node)` returns a new PyObject (with refcount 1) or, if the PyObject already exists, increments its refcount by 1. However, [PyDict_SetItem](https://github.com/python/cpython/blob/1b55b65638254aa78b005fbf0b71fb02499f1852/Objects/dictobject.c#L1532) calls into [insertdict](https://github.com/python/cpython/blob/v3.8.1/Objects/dictobject.c#L1034), which increments the refcount again. This means that when the dict is destroyed, the refcount of the PyObject is still at least one. That keeps `parent_node` (the backward function) alive, which in turn keeps the saved tensors alive.

Similar calls to `functionToPyObject` elsewhere in the codebase do not require a matching Py_DECREF when the result is passed into a tuple (instead of a dict), because the analogous PyTuple_SetItem call steals the reference rather than incrementing the refcount.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51610

Reviewed By: albanD

Differential Revision: D26240336

Pulled By: soulitzer

fbshipit-source-id: 2854528f66fab9dbce448f8a7ba732ce386a7310
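The core of the leak is that a dict stores its *own* reference to a value (PyDict_SetItem increments the refcount), so a caller who already holds a new reference from `functionToPyObject` must release it afterwards. A minimal Python-level sketch of this ownership rule, using `sys.getrefcount` on a hypothetical stand-in object (not the actual autograd node), illustrates that the dict's reference appears on insertion and disappears only when the dict is destroyed:

```python
import sys

class Node:
    """Hypothetical stand-in for a backward-function PyObject."""
    pass

n = Node()
# Baseline refcount (getrefcount itself adds a temporary reference).
base = sys.getrefcount(n)

parents = {}
parents["parent"] = n
# The dict now owns its own reference to n, so the count rises by one.
assert sys.getrefcount(n) == base + 1

del parents
# Destroying the dict releases the dict's reference; the count returns
# to baseline. In the C API, if the caller never Py_DECREF'd the extra
# reference it obtained before PyDict_SetItem, the object would stay
# alive even after this point -- that is the leak being fixed.
assert sys.getrefcount(n) == base
```

In pure Python the interpreter manages references for us, so the "forgotten Py_DECREF" cannot literally be reproduced here; the sketch only shows why the dict path needs an explicit decrement while the tuple path (where PyTuple_SetItem steals the reference) does not.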