pytorch
42486963 - Integrate NNC conv2d with fuser (#55213)

Commit
3 years ago
Integrate NNC conv2d with fuser (#55213)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55213

Adds the integration of conv2d with the TE fuser. A few things of interest:

- I'm *super* selective of what convs get lowered. Only 3x3 depthwise, because I've benchmarked those to death and I'm pretty sure it's a good change.
- I'm allowing single-node "fusion" groups for supported convs. (Maybe this is a sign that conv2d codegen should go through a different path entirely, but it seems to basically work).

I'll share full benchmark results once I clean them up a little. To summarize, I tested the following torchvision models containing depthwise convolutions. Results are single-core on a skylake-avx512:

mobilenet_v2: 8% improvement
mobilenet_v3: 9% improvement
mnasnet: 10% improvement
shufflenet: 18% improvement

Note these are comparing against a baseline with a fast-but-buggy grouped convolution implementation in MKLDNN. So perf results will be better if compared on master, but I'm going to assume the MKLDNN bug will be fixed and the implementation re-enabled.

Perf results are more complicated when comparing to freezing plus conversion to mkldnn layout; mobilenet v2/v3 are still faster, but mnasnet and shufflenet are not. Landing this doesn't prevent MKLDNN freezing from kicking in though, so there's no harm (although landing mkldnn freezing will regress mobilenet, but c'est la vie).

ghstack-source-id: 126076112

Test Plan: New unit test, plus torchvision

Reviewed By: ZolotukhinM

Differential Revision: D27530272

fbshipit-source-id: 92153fad234bc9f1eaa4f7624c543168d1294a87
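The "only 3x3 depthwise" selection rule described in the summary can be illustrated with a small predicate. This is a minimal sketch in plain Python, not the check the commit adds: `Conv2dSpec` and `is_3x3_depthwise` are hypothetical names, and the sketch assumes "depthwise" means groups == in_channels == out_channels (channel multiplier of 1). The real fuser check may also constrain stride, padding, dilation, or dtype, none of which are stated in the commit message.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Conv2dSpec:
    """Hypothetical description of a conv2d node's static configuration."""
    in_channels: int
    out_channels: int
    groups: int
    kernel_size: Tuple[int, int]


def is_3x3_depthwise(spec: Conv2dSpec) -> bool:
    """Return True only for 3x3 depthwise convolutions.

    Assumption: depthwise here means one filter per input channel,
    i.e. groups == in_channels == out_channels.
    """
    depthwise = spec.groups == spec.in_channels == spec.out_channels
    return depthwise and spec.kernel_size == (3, 3)


# A MobileNet-style depthwise layer qualifies; a dense 3x3 conv
# and a 5x5 depthwise conv do not.
depthwise_3x3 = Conv2dSpec(in_channels=32, out_channels=32, groups=32, kernel_size=(3, 3))
dense_3x3 = Conv2dSpec(in_channels=3, out_channels=32, groups=1, kernel_size=(3, 3))
depthwise_5x5 = Conv2dSpec(in_channels=32, out_channels=32, groups=32, kernel_size=(5, 5))

print(is_3x3_depthwise(depthwise_3x3))
print(is_3x3_depthwise(dense_3x3))
print(is_3x3_depthwise(depthwise_5x5))
```

Keeping the predicate this narrow mirrors the stated intent: lower only the configuration that has been benchmarked, and leave every other convolution on its existing path.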