7d89c3b0 - Prefer contiguous output from mkldnn_bf16_gemm (#82968)

Prefer contiguous output from mkldnn_bf16_gemm (#82968)

In https://github.com/pytorch/pytorch/pull/65840#issuecomment-1207843020 it was reported that `mkldnn_bf16_gemm` resulted in extra reorder calls. This appears to be caused by the fortran-contiguous strides on the output tensor. Rearranging the matmul operation so that its output is written with c-contiguous strides removes these extra reorders.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82968
Approved by: https://github.com/malfet
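The layout difference is easy to observe from Python. The sketch below is illustrative only (it is not the commit's actual ATen/oneDNN change): it shows how the same matmul result can end up with c-contiguous (row-major) or fortran-contiguous (column-major) strides, the latter being the layout that forced the extra oneDNN reorders.

```python
import torch

A = torch.randn(128, 256, dtype=torch.bfloat16)
B = torch.randn(256, 64, dtype=torch.bfloat16)

# Direct matmul: the output is c-contiguous (row-major), strides (64, 1).
C = A @ B
print(C.stride(), C.is_contiguous())      # (64, 1) True

# Computing the transposed product and viewing it back gives the same values,
# but with fortran-contiguous (column-major) strides, (1, 128). An output laid
# out this way is the kind that triggered the extra reorder calls.
C_t = (B.t() @ A.t()).t()
print(C_t.stride(), C_t.is_contiguous())  # (1, 128) False

# Both layouts hold the same values.
print(torch.allclose(C.float(), C_t.float()))
```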