Fixing a bug where allocating a 4GB block results in using 8GB of memory (#95827)
I added two constants: the first avoids rounding once an allocation exceeds a certain threshold, and the second controls which blocks can be cached.
Allocations larger than `kMaxRoundThreshold` are no longer rounded up to the next power of two. Larger allocations are generally expected to be less frequent, and this more or less matches what `CudaCachingAllocator` does.
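For illustration, here is a minimal sketch of the intended rounding behavior. `kMaxRoundThreshold` is the constant from this PR; the `roundSize` helper and the threshold value are hypothetical and not the actual ATen implementation:

```cpp
#include <cstddef>

// Threshold above which requests are no longer rounded up.
// (Constant name from this PR; the value here is an assumption.)
constexpr std::size_t kMaxRoundThreshold = 1024ULL * 1024 * 1024; // e.g. 1GB

// Round small requests up to the next power of two; pass large
// requests through unchanged so a 4GB request is not inflated to 8GB.
// (Hypothetical helper, for illustration only.)
std::size_t roundSize(std::size_t nbytes) {
  if (nbytes == 0 || nbytes > kMaxRoundThreshold) {
    return nbytes;
  }
  std::size_t rounded = 1;
  while (rounded < nbytes) {
    rounded <<= 1;
  }
  return rounded;
}
```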
Blocks larger than `kMaxCachedSize` will not be cached. This is a separate problem from the one above, but I noticed the caching here is poorly implemented: it does nothing to avoid fragmentation or to improve resource utilization. For example, the following allocations:
```
t1 = alloc(4GB)
del t1
t2 = alloc(10k)
t3 = alloc(4GB)
```
This results in 8GB being allocated: the cached 4GB block freed by `t1` gets assigned to the 10k allocation, wasting the rest of the block, so `t3` has to allocate a fresh 4GB block.
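For illustration, a simplified model of that caching behavior. `kMaxCachedSize` is the constant from this PR; the value, the multimap cache, and the `allocate`/`release` functions are hypothetical, not the real allocator:

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>

// Blocks larger than this are not kept in the free-block cache.
// (Constant name from this PR; the value here is an assumption.)
constexpr std::size_t kMaxCachedSize = 256ULL * 1024 * 1024; // e.g. 256MB

// Free blocks keyed by size. (Simplified model, for illustration only.)
static std::multimap<std::size_t, void*> free_blocks;

void* allocate(std::size_t nbytes) {
  // Reuse the smallest cached block that fits; without a size guard a
  // cached 4GB block can satisfy a 10k request, wasting almost all of it.
  auto it = free_blocks.lower_bound(nbytes);
  if (it != free_blocks.end()) {
    void* ptr = it->second;
    free_blocks.erase(it);
    return ptr;
  }
  return std::malloc(nbytes);
}

void release(std::size_t nbytes, void* ptr) {
  if (nbytes > kMaxCachedSize) {
    std::free(ptr);  // too large to cache: return it to the system
    return;
  }
  free_blocks.emplace(nbytes, ptr);  // keep smaller blocks for reuse
}
```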
Lastly, I would ideally make these constants configurable, but looking around the code I didn't see any existing mechanism in ATen to configure things at runtime.
Fixes #95823
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95827
Approved by: https://github.com/ngimel