Prefer contiguous output from mkldnn_bf16_gemm (#82968)
In https://github.com/pytorch/pytorch/pull/65840#issuecomment-1207843020 it was reported that `mkldnn_bf16_gemm` resulted in extra reorder calls. This appears to be caused by the fortran-contiguous (column-major) strides on the output tensor. Rearranging the matmul so that the output is written with c-contiguous (row-major) strides removes these extra reorders.
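
For illustration only, here is a minimal Python sketch (not the actual ATen change) of the layout identity such a rearrangement relies on: since `C = A @ B` implies `C.T = B.T @ A.T`, a kernel that naturally produces a column-major result can be handed the swapped, transposed operands so that the buffer the caller sees is already row-major and no reorder is needed afterwards. The helper names below are hypothetical.

```python
import torch

def gemm_fortran_out(a, b):
    # Stand-in for a BLAS-style routine whose output is column-major:
    # the transpose of the result is contiguous, the result itself is not.
    return (a @ b).t().contiguous().t()

def gemm_c_contiguous_out(a, b):
    # Same product, computed as (B.T @ A.T).T, so the memory filled by
    # the column-major kernel is already row-major for the caller.
    return gemm_fortran_out(b.t(), a.t()).t()

a = torch.randn(128, 64, dtype=torch.bfloat16)
b = torch.randn(64, 256, dtype=torch.bfloat16)

c_fortran = gemm_fortran_out(a, b)
c_row = gemm_c_contiguous_out(a, b)

print(c_fortran.is_contiguous())  # False: column-major strides, reorder needed later
print(c_row.is_contiguous())      # True: downstream reorder is unnecessary
# Values agree up to accumulation-order rounding.
print(torch.allclose(c_fortran.float(), c_row.float(), atol=1e-2))
```
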
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82968
Approved by: https://github.com/malfet