pytorch
7b0f867c - Perf improvement of Conv2d and Conv3d (#40324)

Commit
4 years ago
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40324

1) avoid the use of item
2) bypass the im2col for 1x1 conv

Test Plan:
unit test and perf benchmark to show improvement

```
import numpy as np
import torch
from timeit import Timer  # assumed: the (stmt, setup) call below matches timeit.Timer

num = 50

# Problem size: a single 512-channel 4x4 input, 512 output channels
N = 1
C = 512
H = 4
W = 4

M = 512

# 1x1 kernel, unit stride, no padding
kernel_h = 1
kernel_w = 1
stride_h = 1
stride_w = 1
padding_h = 0
padding_w = 0

X_np = np.random.randn(N, C, H, W).astype(np.float32)
W_np = np.random.randn(M, C, kernel_h, kernel_w).astype(np.float32)
X = torch.from_numpy(X_np)

conv2d_pt = torch.nn.Conv2d(
    C, M, (kernel_h, kernel_w), stride=(stride_h, stride_w),
    padding=(padding_h, padding_w), groups=1, bias=True)

class ConvNet(torch.nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.conv2d = conv2d_pt

    def forward(self, x):
        return self.conv2d(x)

model = ConvNet()

def pt_forward():
    # with torch.autograd.profiler.profile(record_shapes=True) as prof:
    model(X)
    # print(prof.key_averages().table(sort_by="self_cpu_time_total"))

# Disable MKL-DNN so the native convolution path is the one being timed
torch._C._set_mkldnn_enabled(False)

t = Timer("pt_forward()", "from __main__ import pt_forward, X")
```

Before the optimization:
pt time = 5.841153813526034

After the optimization:
pt time = 4.513134760782123

Differential Revision: D22149067

fbshipit-source-id: 538d9eea5b729e6c3da79444bde1784bde828876
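The second change in the summary, bypassing im2col for 1x1 convolutions, can be illustrated with a small NumPy sketch. This is not the code from the pull request: the helper names `im2col`, `conv2d_im2col`, and `conv2d_1x1_direct` are hypothetical, and the sketch assumes a single image, stride 1, and no padding, as in the benchmark. It only shows why the unfolding step is redundant for a 1x1 kernel: the column matrix is just the input reshaped to `(C, H*W)`, so the convolution reduces to a single matrix multiply.

```python
import numpy as np


def im2col(x, kh, kw):
    """Unfold x of shape (C, H, W) into a (C*kh*kw, out_h*out_w) matrix.

    Illustrative helper: stride 1 and no padding are assumed.
    """
    c, h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    row = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                # One row per (channel, kernel offset): the shifted input window.
                cols[row] = x[ci, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    return cols


def conv2d_im2col(x, weight, bias):
    """General path: unfold the input, then do one matrix multiply."""
    m, c, kh, kw = weight.shape
    _, h, w = x.shape
    out = weight.reshape(m, -1) @ im2col(x, kh, kw) + bias[:, None]
    return out.reshape(m, h - kh + 1, w - kw + 1)


def conv2d_1x1_direct(x, weight, bias):
    """1x1 path: skip the unfold and multiply against the reshaped input."""
    m, c, kh, kw = weight.shape
    assert (kh, kw) == (1, 1), "this shortcut only applies to 1x1 kernels"
    _, h, w = x.shape
    out = weight.reshape(m, c) @ x.reshape(c, h * w) + bias[:, None]
    return out.reshape(m, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 4, 4)).astype(np.float32)
    weight = rng.standard_normal((6, 8, 1, 1)).astype(np.float32)
    bias = rng.standard_normal(6).astype(np.float32)

    # For a 1x1 kernel the unfolded matrix is exactly the reshaped input.
    print(np.array_equal(im2col(x, 1, 1), x.reshape(8, 16)))
    print(np.allclose(conv2d_im2col(x, weight, bias),
                      conv2d_1x1_direct(x, weight, bias), atol=1e-5))
```

The direct path avoids allocating and filling the column buffer, which is presumably where the saving for the 1x1 case in the benchmark comes from.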