hardswish: remove unnecessary quantize call (#36980)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36980
This was missed in the original diff. Create the quantized output tensor directly instead of producing it through a quantize call.
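
A minimal pure-Python sketch of the idea, not the actual kernel code from this PR. It assumes the old path obtained its output container by running a quantize pass over a scratch float buffer, while the new path allocates integer storage directly. All helper names (`make_output_via_quantize`, `make_output_directly`, `quantized_hardswish`) are hypothetical and exist only for illustration.

```python
from typing import List


def hardswish(x: float) -> float:
    # hardswish(x) = x * relu6(x + 3) / 6
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def quantize(v: float, scale: float, zero_point: int,
             qmin: int = 0, qmax: int = 255) -> int:
    # Affine quantization to an unsigned 8-bit range, with clamping.
    q = round(v / scale) + zero_point
    return min(max(q, qmin), qmax)


def dequantize(q: int, scale: float, zero_point: int) -> float:
    return (q - zero_point) * scale


def make_output_via_quantize(n: int, scale: float, zero_point: int) -> List[int]:
    # Assumed "before" pattern: build a float scratch buffer, then run a full
    # quantize pass over it only to obtain an integer container.
    scratch = [0.0] * n
    return [quantize(v, scale, zero_point) for v in scratch]


def make_output_directly(n: int) -> List[int]:
    # Assumed "after" pattern: allocate the integer storage directly.
    return [0] * n


def quantized_hardswish(qx: List[int], in_scale: float, in_zp: int,
                        out_scale: float, out_zp: int,
                        out: List[int]) -> List[int]:
    # Every element of `out` is overwritten, so its initial contents
    # (and the pass that produced them) do not affect the result.
    for i, q in enumerate(qx):
        x = dequantize(q, in_scale, in_zp)
        out[i] = quantize(hardswish(x), out_scale, out_zp)
    return out


qx = [0, 98, 128, 158, 255]
before = quantized_hardswish(qx, 0.1, 128, 0.1, 0,
                             make_output_via_quantize(len(qx), 0.1, 0))
after = quantized_hardswish(qx, 0.1, 128, 0.1, 0,
                            make_output_directly(len(qx)))
```

Both paths yield the same values because the kernel overwrites every output element; the direct allocation only skips the extra pass over the buffer.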
Test Plan:
tests still pass
microbenchmarks show a 2x performance improvement for int8 (the exact
speedup will depend on input size):
https://gist.github.com/vkuzo/3b321b428e4c38e805000961c263286b
Imported from OSS
Differential Revision: D21185970
fbshipit-source-id: 5b9e93d9f9ac05a8120532bd03ad347541a132c2