[TensorExpr] Cache use of fallback in kernel invocation (#47812)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47812
Previously we were checking the environment on every kernel invocation via `tensorExprFuserEnabled`, which reads the `PYTORCH_TENSOREXPR` environment variable. This is only a dev-exposed API, so I think it is fine to check it once when the kernel is initialized and cache the result. The `disable_optimization` flag, which is user-exposed, more or less covers the same functionality.
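The pattern is just hoisting a per-call environment lookup into construction-time state. A minimal Python sketch of the idea (hypothetical names; the real change is in the C++ TensorExpr fuser):

```python
import os

def tensor_expr_fuser_enabled():
    # Old behavior: reads the environment on every call.
    return os.environ.get("PYTORCH_TENSOREXPR", "1") != "0"

class Kernel:
    def __init__(self):
        # New behavior: check the environment once at kernel
        # initialization and cache whether to use the fallback.
        self.use_fallback = not tensor_expr_fuser_enabled()

    def run(self, x, y):
        if self.use_fallback:
            return x + y  # illustrative fallback (interpreter) path
        return x + y      # illustrative compiled-kernel path
```

Later changes to `PYTORCH_TENSOREXPR` no longer affect an already-initialized kernel, which is the accepted trade-off for a dev-only knob.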
For fun, some benchmarking. I compared the scripted function before and after this change:
```
def foo(x, y):
    return x + y
```
with x and y both set to torch.tensor([1]). I also removed the prim::TypeCheck node to better isolate the kernel (I cheated). Here is the gist: https://gist.github.com/eellison/39f3bc368f5bd1f25ded4827feecd15e
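The linked gist has the full setup; a dependency-free sketch of the timing harness that produces the sum/min/median numbers below might look like this (the real benchmark compiles `foo` with `torch.jit.script` and passes `torch.tensor([1])` inputs; plain ints are used here only to keep the sketch self-contained):

```python
import statistics
import timeit

def foo(x, y):
    return x + y

# Run the workload several times and report aggregate statistics,
# mirroring the sum / min / median lines in the results below.
times = timeit.repeat(lambda: foo(1, 1), number=100_000, repeat=10)
print("sum", sum(times), "min:", min(times), "median", statistics.median(times))
```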
Without Changes Run 1:
no fusion: sum 6.416894399004377 min: 0.6101883250012179 median 0.6412974080012646
with fusion: sum 6.437897570998757 min: 0.6350401220006461 median 0.6446951820034883
Without Changes Run 2:
no fusion: sum 6.601341788002173 min: 0.6292048720024468 median 0.6642187059987918
with fusion: sum 6.734651455997664 min: 0.6365462899993872 median 0.6755226659988693
With Changes Run 1:
no fusion: sum 6.097717430002376 min: 0.5977709550024883 median 0.613631643998815
with fusion: sum 6.1299369639964425 min: 0.5857932209983119 median 0.6159247440009494
With Changes Run 2:
no fusion: sum 6.5672018059995025 min: 0.6245676209982776 median 0.6386050750006689
with fusion: sum 6.489086147994385 min: 0.6236886289989343 median 0.6535737619997235
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D25286210
fbshipit-source-id: a18b4918a7f7bed8a39112ae04b678e79026d39b