Enable stateless XNNPACK convolutions. (#35790)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35790
The best way to use XNNPACK is to separate operator creation from
execution, also known as prepacking the weights. If we have done our
job properly, the JIT should already have caught nn.Conv2d modules on
mobile and replaced them with their prepacked versions. If we
nevertheless end up in _convolution, it is still more efficient to go
through XNNPACK for NHWC tensors than to convert NHWC to NCHW and go
through NNPACK.
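To illustrate the difference between the prepacked and stateless paths,
here is a toy sketch in plain Python. It is not the code in this PR and
uses no PyTorch or XNNPACK API: PackCounter, create_op, run_op, and
stateless_conv are hypothetical names, and a 1-D correlation stands in
for the real convolution. It only shows that the prepacked path pays the
creation cost once, while the stateless path pays it on every call.

```python
from dataclasses import dataclass


@dataclass
class PackCounter:
    """Hypothetical helper: counts how often an operator is created."""
    count: int = 0


@dataclass(frozen=True)
class PackedConv:
    """Hypothetical stand-in for an operator with prepacked weights."""
    packed_weight: tuple


def create_op(weight, counter):
    # Creation step: "pack" the weights (here, just copy them into a tuple).
    counter.count += 1
    return PackedConv(tuple(weight))


def run_op(op, x):
    # Execution step: 1-D "valid" correlation as a stand-in for conv2d.
    k = len(op.packed_weight)
    return [
        sum(op.packed_weight[j] * x[i + j] for j in range(k))
        for i in range(len(x) - k + 1)
    ]


def stateless_conv(weight, x, counter):
    # Stateless path: create and run in one call, so packing is repeated.
    return run_op(create_op(weight, counter), x)


weight = [1, 0, -1]
x = [1, 2, 3, 4, 5]

# Prepacked path: create once, run many times.
prepacked_counter = PackCounter()
op = create_op(weight, prepacked_counter)
prepacked_out = [run_op(op, x) for _ in range(3)]

# Stateless path: creation happens on each of the three calls.
stateless_counter = PackCounter()
stateless_out = [stateless_conv(weight, x, stateless_counter) for _ in range(3)]
```

Both paths produce the same results; only the number of creation steps
differs (one for the prepacked path, three for the stateless path in
this sketch).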
Differential Revision: D20821864
Test Plan: Imported from OSS
Pulled By: AshkanAliabadi
fbshipit-source-id: 2732280c2fd31edcb39658f6530d03331a1a4a75