Don't clone unmutated args in triton autotuning (#89519)
Improves first-iteration memory compression on pytorch struct from .55 -> .73. However, it doesn't totally eliminate the overhead from autotuning. Any other pointers on where the remaining autotuning overhead is coming from would be great.
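The idea behind the change can be sketched as follows. Autotuning benchmarks a kernel under several configs, so any arg the kernel mutates must be copied first to keep the inputs pristine between runs; previously every arg was cloned, which inflated peak memory. This is a minimal, dependency-free sketch (plain lists stand in for tensors, `copy.deepcopy` for `Tensor.clone`; the `autotune`/`mutated_indices` names are hypothetical, not the actual inductor API):

```python
import copy
import time

def autotune(kernel, args, mutated_indices, configs):
    """Pick the fastest config, cloning only the args the kernel mutates.

    Hypothetical sketch: `kernel` takes (*args, **config), and
    `mutated_indices` holds the positions of args the kernel writes to.
    Read-only args are passed through without a copy, which is the
    memory saving this PR targets.
    """
    best_cfg, best_time = None, float("inf")
    for cfg in configs:
        # Fresh copies only for mutated args; everything else is shared.
        call_args = [
            copy.deepcopy(a) if i in mutated_indices else a
            for i, a in enumerate(args)
        ]
        start = time.perf_counter()
        kernel(*call_args, **cfg)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg

# Toy "kernel" that writes a scaled copy of x into out, in place.
def scale_kernel(x, out, factor=1):
    for i, v in enumerate(x):
        out[i] = v * factor

x = [1, 2, 3]
out = [0, 0, 0]
best = autotune(
    scale_kernel,
    [x, out],
    mutated_indices={1},           # only `out` is written to
    configs=[{"factor": 2}, {"factor": 3}],
)
# `out` is untouched because each benchmark run wrote to a clone,
# and `x` was never copied at all.
print(out)  # [0, 0, 0]
print(x)    # [1, 2, 3]
```

In the real autotuner the mutated-arg set comes from the compiler's own analysis of the kernel; the point of the sketch is just that the clone is restricted to that set instead of applied to every arg.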
Edit: I think it's just the triton cache clearing: https://github.com/openai/triton/blob/44f577984d28ee979f704e2c28a1dcbac9639840/python/triton/testing.py#L159
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89519
Approved by: https://github.com/ngimel, https://github.com/jansel