pytorch
26301447 - Call to mkldnn_matmul from aten::addmm on AArch64 (#91763)

Call to mkldnn_matmul from aten::addmm on AArch64 (#91763)

We have noticed that for BERT_pytorch in torchbenchmark, the majority of time is spent running GEMM in aten::addmm. At the moment this calls into a BLAS routine, but on AArch64 it is faster to call into mkldnn_matmul. Compared to a build with OpenBLAS, it runs 1.2x faster on 16 cores with a batch size of 8 on Graviton3; if fast math mode is enabled (mkldnn_matmul exposes, through oneDNN and the Arm Compute Library, an option to run GEMM with FP32 inputs using BF16 operations), it is 2.3x faster.

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91763
Approved by: https://github.com/jgong5, https://github.com/ngimel, https://github.com/malfet
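For context, aten::addmm computes beta * bias + alpha * (mat1 @ mat2); the mat1 @ mat2 GEMM is the part this commit redirects from the generic BLAS path to mkldnn_matmul on AArch64. A minimal NumPy sketch of the operator's semantics (the function and the shapes are illustrative, not PyTorch code):

```python
import numpy as np

def addmm(bias, mat1, mat2, beta=1.0, alpha=1.0):
    """Sketch of aten::addmm semantics: beta*bias + alpha*(mat1 @ mat2).

    The mat1 @ mat2 GEMM is what dominates BERT_pytorch runtime and what
    this commit routes to mkldnn_matmul instead of a plain BLAS call.
    """
    return beta * bias + alpha * (mat1 @ mat2)

# Shapes loosely modelled on a transformer linear layer (illustrative).
rng = np.random.default_rng(0)
bias = rng.standard_normal((8, 768)).astype(np.float32)
mat1 = rng.standard_normal((8, 768)).astype(np.float32)
mat2 = rng.standard_normal((768, 768)).astype(np.float32)
out = addmm(bias, mat1, mat2)
print(out.shape)  # (8, 768)
```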
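The fast math mode mentioned above runs an FP32 GEMM using BF16 arithmetic, which keeps the FP32 exponent range but drops 16 mantissa bits. A hedged sketch of the precision effect (the rounding helper below is an emulation for illustration, not the oneDNN/Arm Compute Library implementation):

```python
import numpy as np

def round_fp32_to_bf16(x):
    # Round-to-nearest-even FP32 -> BF16, result kept in FP32 storage.
    # Emulation only; oneDNN/ACL perform this inside the GEMM kernels.
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    rounded = (bits + np.uint32(0x7FFF)
               + ((bits >> np.uint32(16)) & np.uint32(1))) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)

rng = np.random.default_rng(1)
a = rng.standard_normal((64, 64)).astype(np.float32)
b = rng.standard_normal((64, 64)).astype(np.float32)
exact = a @ b
approx = round_fp32_to_bf16(a) @ round_fp32_to_bf16(b)  # emulate BF16 inputs
rel_err = np.abs(approx - exact).max() / np.abs(exact).max()
print(f"max relative error with BF16-rounded inputs: {rel_err:.4f}")
```

The small relative error this introduces is why fast math is an opt-in mode rather than the default.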
Files changed:
  • BUILD.bazel
  • aten/src/ATen/Config.h.in
  • aten/src/ATen/native/LinearAlgebra.cpp
  • aten/src/ATen/native/mkldnn/Matmul.cpp
  • aten/src/ATen/test/verify_api_visibility.cpp
  • buckbuild.bzl
  • cmake/Dependencies.cmake