pytorch
d5021186 - Use SLEEF functions for NEON vectors on macOS ARM64 (#70354)


Commit

2 years ago

Use SLEEF functions for NEON vectors on macOS ARM64 (#70354)

Summary:
We noticed that on M1 Macs Transformer network profiles are dominated by scalar `exp` and `erff` functions (for softmax and GELU). The NEON `Vectorized<float>` implementation does not use SLEEF functions in order to compile on mobile platforms. However, SLEEF is already compiled on macOS ARM64 and is safe to use there.

This change adds another implementation of `Vectorized<float>` that uses SLEEF functions. This implementation is only used on macOS ARM64.

This change speeds up e.g. prediction of spaCy transformer models by 20% on M1 Macs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70354

Reviewed By: albanD
Differential Revision: D33659540
Pulled By: kimishpatel
fbshipit-source-id: b8f02a61321873fc60778190a005c466c7d0cc0c
(cherry picked from commit 71286a207cefaae5a0be4eb3d618b55366ee4861)
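To illustrate the scalar path the commit message describes, here is a minimal, portable sketch of a vector type whose `exp` and `erf` fall back to one scalar libm call per lane. `Vec4f` and its `map` helper are hypothetical names for this illustration only; this is not PyTorch's `Vectorized<float>`. As an assumption about how a SLEEF-backed variant would differ, the per-lane loop would be replaced by a single vector call (SLEEF ships 4-lane float functions such as `Sleef_expf4_u10` and `Sleef_erff4_u10`), which needs NEON and the SLEEF library and is therefore only mentioned in comments.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical 4-lane float vector, for illustration only.
// It is NOT PyTorch's Vectorized<float>.
struct Vec4f {
    std::array<float, 4> lanes;

    // Scalar fallback: apply a scalar function to each lane in turn.
    // This is the pattern that shows up as scalar exp/erff in a profile.
    template <typename F>
    Vec4f map(F f) const {
        Vec4f out{};
        for (std::size_t i = 0; i < lanes.size(); ++i) {
            out.lanes[i] = f(lanes[i]);
        }
        return out;
    }

    // One scalar libm call per lane.
    // Assumption: a SLEEF-backed build would instead issue one vector call
    // (e.g. Sleef_expf4_u10 on a float32x4_t) covering all four lanes.
    Vec4f exp() const {
        return map([](float x) { return std::exp(x); });
    }

    // Same pattern for erf, which GELU relies on.
    // Assumption: the vector counterpart would be Sleef_erff4_u10.
    Vec4f erf() const {
        return map([](float x) { return std::erf(x); });
    }
};
```

For example, `Vec4f v{{0.0f, 1.0f, -1.0f, 2.0f}}; Vec4f e = v.exp();` computes the four exponentials with four separate scalar calls. The sketch shows only the fallback structure and says nothing about the measured speedup, which the commit message reports for the real implementation.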

References

#72894 - Merge pytorch master into lazy_tensor_staging

Author

danieldk

Committer

pytorchmergebot

Parents

f0f49a11
