pytorch
561b5078 - Eliminate device guard in generic dispatch key kernel wrappers (#55131)

Commit
3 years ago
Eliminate device guard in generic dispatch key kernel wrappers (#55131)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55131

Benchmark `zeros_out`:

```python
from torch.utils.benchmark import Timer

counts = Timer(
    stmt="""at::zeros_out(t, {1});""",
    setup="auto t = at::empty({1});",
    language="cpp",
).collect_callgrind(number=1_000)
print(counts)
```

With device guard:
```
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.CallgrindStats object at 0x7f834f095ca0>
at::zeros_out(t, {1});
setup: auto t = at::empty({1});
                           All          Noisy symbols removed
    Instructions:      1396022                    1396022
    Baseline:                0                          0
1000 runs per measurement, 1 thread
```

Without device guard:
```
<torch.utils.benchmark.utils.valgrind_wrapper.timer_interface.CallgrindStats object at 0x7f25e48927c0>
at::zeros_out(t, {1});
setup: auto t = at::empty({1});
                           All          Noisy symbols removed
    Instructions:      1296022                    1296022
    Baseline:                0                          0
1000 runs per measurement, 1 thread
```

We see about `7.7%` improvement relative to the new count (100,000 fewer instructions, or about `7.2%` of the count with the device guard).

ghstack-source-id: 126295368

Test Plan:
```
buck build //caffe2/aten/...
buck test mode/dev mode/no-gpu //caffe2/test:torch -- 'caffe2/test:torch - test_msnpu_error (test_torch.TestTorch)'
```

Reviewed By: ezyang

Differential Revision: D27496584

fbshipit-source-id: 97f783a809b77b28f77a93096d69b3da9ee69df7
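The improvement figure depends on which run is taken as the baseline, so it may help to spell out the arithmetic. The following is a minimal sketch that only reuses the two instruction counts reported above; the variable names are illustrative, and the per-call figure assumes the reported counts are totals over the 1000 runs.

```python
# Instruction counts copied from the two callgrind reports above.
with_guard = 1_396_022
without_guard = 1_296_022
runs = 1_000  # "1000 runs per measurement"

saved = with_guard - without_guard

# Fraction of the original (guarded) cost that was removed.
reduction_vs_old = saved / with_guard
# The same saving expressed relative to the new (unguarded) count.
overhead_vs_new = saved / without_guard

print(f"instructions saved in total: {saved}")
# Assumption: the counts are totals over all runs, so divide for a per-call figure.
print(f"instructions saved per call: {saved // runs}")
print(f"reduction relative to guarded run: {reduction_vs_old:.1%}")
print(f"saving relative to unguarded run: {overhead_vs_new:.1%}")
```

Measured against the guarded run the reduction is about 7.2%; the quoted 7.7% corresponds to dividing by the unguarded count.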