Dynamic dispatch for optimized quantized op kernels (#25545)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25545
This reuses the infrastructure from ATen/native/cpu, which compiles each kernel multiple times for different instruction sets and dispatches dynamically at runtime based on the CPU's capability flags. This ensures we use the fastest available quantized kernel for the given machine.
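For readers unfamiliar with the pattern, here is a minimal standalone sketch of runtime dispatch on CPU capability. It is not the ATen implementation: the names CPUCapability, detect_capability, qrelu_default, qrelu_avx2, and qrelu are hypothetical and chosen only for illustration. It assumes GCC or Clang on x86 for feature detection and falls back to the default kernel elsewhere. Both variants share one scalar body here; in a real build the same source would be compiled once per instruction set with different compiler flags.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical capability levels, for illustration only.
enum class CPUCapability { DEFAULT, AVX2 };

// Query the CPU once at runtime. Assumes GCC/Clang builtins on x86;
// on any other compiler or architecture we report DEFAULT.
inline CPUCapability detect_capability() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) {
    return CPUCapability::AVX2;
  }
#endif
  return CPUCapability::DEFAULT;
}

// Shared kernel body: quantized ReLU clamps each value at the zero point.
inline void qrelu_body(const std::uint8_t* in, std::uint8_t* out,
                       std::size_t n, std::uint8_t zero_point) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = std::max(in[i], zero_point);
  }
}

// Stand-ins for the per-instruction-set builds of the same kernel.
// Here they are identical; only the selection logic is demonstrated.
inline void qrelu_default(const std::uint8_t* in, std::uint8_t* out,
                          std::size_t n, std::uint8_t zero_point) {
  qrelu_body(in, out, n, zero_point);
}

inline void qrelu_avx2(const std::uint8_t* in, std::uint8_t* out,
                       std::size_t n, std::uint8_t zero_point) {
  qrelu_body(in, out, n, zero_point);
}

using qrelu_fn = void (*)(const std::uint8_t*, std::uint8_t*, std::size_t,
                          std::uint8_t);

// Map a capability level to the matching kernel variant.
inline qrelu_fn choose_qrelu(CPUCapability cap) {
  return cap == CPUCapability::AVX2 ? qrelu_avx2 : qrelu_default;
}

// Public entry point: the variant is selected on first call and cached.
inline void qrelu(const std::uint8_t* in, std::uint8_t* out, std::size_t n,
                  std::uint8_t zero_point) {
  static const qrelu_fn impl = choose_qrelu(detect_capability());
  impl(in, out, n, zero_point);
}
```

Callers only ever invoke qrelu(in, out, n, zero_point); which variant runs is decided by the machine the binary executes on, not by build-time flags of the caller.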
Test Plan: Imported from OSS
Differential Revision: D17166369
Pulled By: jamesr66a
fbshipit-source-id: 8c8393f99365e1408819bbaf254c1b5734a34b70