[ATen][quant] Use expect_contiguous in quantized::linear fbgemm version (#58221)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58221
- Use expect_contiguous to avoid Tensor refcount bumps if input tensor is already contiguous
- Use Tensor::sizes()[i] in place of Tensor::size(i), which goes through the dispatcher
- Use at::DimVector in place of std::vector to avoid a heap allocation for small size/stride arrays
Since the qnnpack version needs on-device testing, I'll skip that one for now.
Test Plan: CI
Reviewed By: swolchok
Differential Revision: D28406942
fbshipit-source-id: 3c1bdfd1c859fe71869d4daec22158be5c2719d4