[CUDNN][CUDNN V8 API] LRU Cache for cuDNN frontend `ExecutionPlan` (#104369)
Adds LRU eviction to the cuDNN frontend `ExecutionPlan` cache, with the capacity configurable via the `TORCH_CUDNN_V8_LRU_CACHE_LIMIT` environment variable, to address the high memory usage observed in #98688 and #104122. By default this limit is set to 10000, which empirically corresponds to about 2GiB of host memory usage. Note that we are still following up with cuDNN to see if the size of an `ExecutionPlan` can be reduced, as it currently appears to be around 200KiB (!!) for a single plan.
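The eviction scheme is the standard LRU pattern: a doubly-linked list ordered by recency plus a hash map for O(1) lookup, where a cache hit moves the entry to the front and an insert past the limit evicts from the back. A minimal sketch of that pattern is below; the names (`LruPlanCache`, `Plan`) are illustrative and are not the actual types used in the PR.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for a cuDNN frontend ExecutionPlan (hypothetical type).
struct Plan { std::string graph_key; };

// Hypothetical LRU cache: recency list + map from key to list iterator.
class LruPlanCache {
 public:
  explicit LruPlanCache(size_t limit) : limit_(limit) {}

  // On a hit, move the entry to the front (most recently used) and
  // return a pointer to it; on a miss, return nullptr.
  Plan* find(const std::string& key) {
    auto it = map_.find(key);
    if (it == map_.end()) return nullptr;
    order_.splice(order_.begin(), order_, it->second);
    return &it->second->second;
  }

  void insert(const std::string& key, Plan plan) {
    auto it = map_.find(key);
    if (it != map_.end()) {
      // Overwrite in place and refresh recency.
      it->second->second = std::move(plan);
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    if (limit_ > 0 && map_.size() >= limit_) {
      // Evict the least-recently-used entry (back of the list).
      map_.erase(order_.back().first);
      order_.pop_back();
    }
    order_.emplace_front(key, std::move(plan));
    map_[key] = order_.begin();
  }

  size_t size() const { return map_.size(); }

 private:
  size_t limit_;
  std::list<std::pair<std::string, Plan>> order_;
  std::unordered_map<std::string,
                     std::list<std::pair<std::string, Plan>>::iterator> map_;
};
```

With a limit of 2, touching an entry via `find` protects it from the next eviction, which is the behavior that keeps hot `ExecutionPlan`s resident while bounding total host memory.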
This implementation is heavy on internal asserts for now, as it is difficult to directly test the state of the cache without explicitly instrumenting it in tests. Once we are confident that the implementation is stable, the asserts can be removed.
CC @malfet, who @ptrblck mentioned may also have been looking into this
CC @colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104369
Approved by: https://github.com/malfet