pytorch
fce6d6b3 - Redefine the simdlen semantic: (#88482)

Commit

2 years ago

Redefine the simdlen semantic: (#88482)

This PR aims to automatically enable the vectorization optimization for TorchInductor. It refines the semantics of `config.cpp.simdlen`. Originally, `None` meant that vectorization was disabled, while a specific value gave the number of elements vectorized at a time. However, that element count depends on the data type: for a 256-bit SVE/SIMD ISA on ARM and X86, `simdlen` would have to be 8 for Float (32-bit) but 16 for BFloat (16-bit). Hence, this PR defines `simdlen` as the bit width. The detailed semantics are as follows.

- **_simdlen = None_**: Automatically determine the SIMD bit width. Detect the hardware information and pick the proper vectorization ISA. On X86 specifically, AVX512 has higher priority than AVX2.
- **_simdlen <= 1_**: Explicitly disable SIMD.
- **_simdlen > 1_**: Explicitly specify the SIMD bit width. If the bit width does not match the ISA width, this is equivalent to disabling SIMD.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88482
Approved by: https://github.com/jgong5, https://github.com/jansel
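The three cases above can be sketched in plain Python. This is only an illustration of the described semantics, not the code from the PR: `resolve_simd_bit_width`, `elements_per_vector`, and the `DTYPE_BITS` table are hypothetical names, and the list of supported widths is assumed to come from some hardware detection step that is not shown here.

```python
from typing import Optional, Sequence

# Hypothetical table of element sizes in bits, for illustration only.
DTYPE_BITS = {"float32": 32, "bfloat16": 16}


def resolve_simd_bit_width(
    simdlen: Optional[int], supported_widths: Sequence[int]
) -> Optional[int]:
    """Return the SIMD bit width to use, or None if vectorization is disabled.

    `supported_widths` stands in for the result of hardware detection,
    e.g. [256, 512] on an X86 machine with both AVX2 and AVX512.
    """
    if simdlen is None:
        # Automatic mode: prefer the widest ISA (AVX512 over AVX2 on X86).
        return max(supported_widths, default=None)
    if simdlen <= 1:
        # Explicitly disabled.
        return None
    # Explicit bit width: a width that matches no ISA behaves like "disabled".
    return simdlen if simdlen in supported_widths else None


def elements_per_vector(bit_width: int, dtype: str) -> int:
    """Number of elements of `dtype` that fit in one vector register."""
    return bit_width // DTYPE_BITS[dtype]


if __name__ == "__main__":
    widths = [256, 512]
    print(resolve_simd_bit_width(None, widths))
    print(resolve_simd_bit_width(1, widths))
    print(resolve_simd_bit_width(128, widths))
    print(elements_per_vector(256, "float32"), elements_per_vector(256, "bfloat16"))
```

Expressing `simdlen` in bits keeps the setting independent of the data type; the per-type element count is derived from the width rather than configured separately.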

Author

EikanWang

Committer

pytorchmergebot

Parents

c3acb9c8
