pytorch
bc716383 - Redefine the simdlen semantic (#89263)

Commit
2 years ago
This PR aims to enable vectorization automatically for TorchInductor. It refines the semantics of `config.cpp.simdlen`.

Originally, `None` disabled vectorization, while a specific value gave the number of elements to vectorize at a time. That element count depends on the data type, however: for a 256-bit SVE/SIMD ISA on ARM and X86, `simdlen` would have to be 8 for Float but 16 for BFloat. Hence, this PR defines `simdlen` as the bit width instead. The detailed semantics are as follows.

- **_simdlen = None_**: Automatically determine the SIMD bit width. Detect the hardware information and pick the proper vectorization ISA. On X86 specifically, AVX512 takes priority over AVX2.
- **_simdlen <= 1_**: Explicitly disable SIMD.
- **_simdlen > 1_**: Explicitly specify the SIMD bit width. If the bit width does not match any ISA width, this is equivalent to disabling SIMD.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89263
Approved by: https://github.com/jgong5, https://github.com/jansel
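The three `simdlen` cases described in this message can be sketched in plain Python. This is only an illustration of the stated semantics, not the TorchInductor implementation: the names `resolve_simd_bit_width`, `elements_per_vector`, and `DTYPE_BITS` are hypothetical, and the list of supported widths is passed in by hand rather than detected from hardware.

```python
from typing import Optional, Sequence

# Hypothetical table of element sizes in bits (assumption for illustration).
DTYPE_BITS = {"float32": 32, "bfloat16": 16}


def resolve_simd_bit_width(
    simdlen: Optional[int], supported_widths: Sequence[int]
) -> Optional[int]:
    """Return the SIMD bit width to use, or None if vectorization is disabled.

    `supported_widths` stands in for hardware detection, e.g. [256, 512]
    on an X86 machine that supports both AVX2 and AVX512.
    """
    if simdlen is None:
        # Automatic mode: prefer the widest ISA (AVX512 over AVX2).
        return max(supported_widths) if supported_widths else None
    if simdlen <= 1:
        # Explicitly disabled.
        return None
    # Explicit bit width: a mismatch behaves like "disabled".
    return simdlen if simdlen in supported_widths else None


def elements_per_vector(bit_width: int, dtype: str) -> int:
    """Number of elements of `dtype` that fit in one vector register."""
    return bit_width // DTYPE_BITS[dtype]


if __name__ == "__main__":
    widths = [256, 512]
    for setting in (None, 1, 256, 128):
        print(setting, "->", resolve_simd_bit_width(setting, widths))
    print(elements_per_vector(256, "float32"), elements_per_vector(256, "bfloat16"))
```

Expressing the setting as a bit width keeps it independent of the data type: the per-dtype element count falls out of a single division instead of having to be configured separately.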