Fix incorrect runtime error in mul_() when the tensor layout is Mkldnn (#51758)
Summary:
Calling Mkl-layout's mul_ from C++ API raises a RuntimeError.
The error message is below:
```
terminate called after throwing an instance of 'c10::Error'
what(): unsupported tensor layout: Mkldnn
```
Environment
- CPU: Intel(R) Core(TM) i7-8086K CPU @ 4.00GHz
- OS: Ubuntu 18.04.1 LTS
- Compiler: gcc 7.5.0
- Branch: master
- Commit ID: 16cfe97
- Build environment variables: USE_CUDA=0, USE_DISTRIBUTED=0, USE_MKLDNN=1
- Python: 3.6.9
CMakeLists.txt
```
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(mkldnn_test)
find_package(Torch REQUIRED)
add_executable(mkldnn_test mkldnn_test.cpp)
target_link_libraries(mkldnn_test "${TORCH_LIBRARIES}")
set_property(TARGET mkldnn_test PROPERTY CXX_STANDARD 14)
```
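For reference, a project like this is typically built out-of-tree with CMake, pointing `CMAKE_PREFIX_PATH` at a libtorch (or PyTorch source build) installation. The path below is a placeholder, not taken from this report:

```shell
# Standard out-of-tree CMake build against libtorch;
# replace the prefix path with your actual libtorch location.
mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
cmake --build . --config Release
```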
mkldnn_test.cpp
```
#include <torch/torch.h>
int main() {
torch::Tensor a = torch::randn({2, 2});
torch::Tensor a_mkl = a.to_mkldnn();
a.mul_(0.5);
a_mkl.mul_(0.5);
std::cout << a << std::endl;
std::cout << a_mkl.to_dense() << std::endl;
return 0;
}
```
Expected Result
```
$ ./mkldnn_test
0.1344 0.8107
-0.8157 -0.2610
[ CPUFloatType{2,2} ]
0.1344 0.8107
-0.8157 -0.2610
[ CPUFloatType{2,2} ]
```
Execution Result
```
$ ./mkldnn_test
terminate called after throwing an instance of 'c10::Error'
what(): unsupported tensor layout: Mkldnn
Exception raised from validate at /home/gtka7311/pytorch_v180/c_api_test/pytorch/aten/src/ATen/TensorIterator.h:128 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f8a1472690b in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xce (0x7f8a1472316e in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libc10.so)
frame #2: <unknown function> + 0x965bc3 (0x7f8a0d07dbc3 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #3: at::TensorIteratorBase::populate_operands(at::TensorIteratorConfig&) + 0xf1 (0x7f8a0d079ee1 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #4: at::TensorIteratorBase::build(at::TensorIteratorConfig&) + 0x3b (0x7f8a0d07ad3b in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #5: at::TensorIteratorBase::build_binary_op(at::Tensor const&, at::Tensor const&, at::Tensor const&) + 0x129 (0x7f8a0d07b339 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #6: at::TensorIterator::binary_op(at::Tensor&, at::Tensor const&, at::Tensor const&) + 0x38 (0x7f8a0d07b418 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #7: at::native::mul_out(at::Tensor&, at::Tensor const&, at::Tensor const&) + 0x33 (0x7f8a0d217793 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #8: at::native::mul_(at::Tensor&, c10::Scalar) + 0x45 (0x7f8a0d217865 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #9: <unknown function> + 0x1435c21 (0x7f8a0db4dc21 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #10: at::Tensor& c10::Dispatcher::call<at::Tensor&, at::Tensor&, c10::Scalar>(c10::TypedOperatorHandle<at::Tensor& (at::Tensor&, c10::Scalar)> const&, at::Tensor&, c10::Scalar) const + 0x15c (0x7f8a0d9e482c in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #11: <unknown function> + 0x2a86269 (0x7f8a0f19e269 in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #12: at::Tensor& c10::Dispatcher::call<at::Tensor&, at::Tensor&, c10::Scalar>(c10::TypedOperatorHandle<at::Tensor& (at::Tensor&, c10::Scalar)> const&, at::Tensor&, c10::Scalar) const + 0x15c (0x7f8a0d9e482c in /home/gtka7311/pytorch_v180/c_api_test/pytorch/torch/lib/libtorch_cpu.so)
frame #13: main + 0xfd (0x5653221cd282 in ./mkldnn_test)
frame #14: __libc_start_main + 0xe7 (0x7f8a0bba5b97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #15: _start + 0x2a (0x5653221ccf2a in ./mkldnn_test)
```
Modification policy for the code
In general, ``mul_`` is handled by the ``TensorIterator`` path in ``mul_out``. However, ``TensorIterator`` does not support Mkldnn-layout tensors. To solve this problem, ``aten/src/ATen/native/BinaryOps.cpp`` is modified so that ``mkldnn_mul_out`` is executed when an Mkldnn-layout tensor is passed to ``mul_out``.
The code change is as follows:
```
diff --git a/aten/src/ATen/native/BinaryOps.cpp b/aten/src/ATen/native/BinaryOps.cpp
index ee55114285..5c403546f2 100644
--- a/aten/src/ATen/native/BinaryOps.cpp
+++ b/aten/src/ATen/native/BinaryOps.cpp
@@ -270,6 +270,9 @@ Tensor& floor_divide_(Tensor& self, const Tensor& other) {
}
Tensor& mul_out(Tensor& result, const Tensor& self, const Tensor& other) {
+ if (self.is_mkldnn()) {
+ return native::mkldnn_mul_out(result, self, other);
+ }
auto iter = TensorIterator::binary_op(result, self, other);
mul_stub(iter.device_type(), iter);
return result;
```
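The patch is an instance of a common layout-dispatch pattern: check the input's layout up front and route it to a dedicated kernel before falling through to the generic path. A self-contained sketch of that pattern in plain C++ (no ATen; `Tensor`, `mkldnn_mul_out`, and `generic_mul_out` below are simplified stand-ins, not the real ATen types):

```cpp
#include <cassert>

// Simplified mirror of the dispatch added in BinaryOps.cpp: route
// Mkldnn-layout tensors to a dedicated kernel instead of the generic
// TensorIterator path. All names here are illustrative only.
enum class Layout { Strided, Mkldnn };

struct Tensor {
  Layout layout;
  double value;  // stand-in for real tensor data
  bool is_mkldnn() const { return layout == Layout::Mkldnn; }
};

// Dedicated kernel for the Mkldnn layout (stands in for mkldnn_mul_out).
Tensor& mkldnn_mul_out(Tensor& result, const Tensor& self, const Tensor& other) {
  result = Tensor{Layout::Mkldnn, self.value * other.value};
  return result;
}

// Generic path (stands in for TensorIterator::binary_op + mul_stub).
Tensor& generic_mul_out(Tensor& result, const Tensor& self, const Tensor& other) {
  result = Tensor{Layout::Strided, self.value * other.value};
  return result;
}

Tensor& mul_out(Tensor& result, const Tensor& self, const Tensor& other) {
  if (self.is_mkldnn()) {  // the guard the patch adds
    return mkldnn_mul_out(result, self, other);
  }
  return generic_mul_out(result, self, other);
}
```

With this guard in place, the Mkldnn input never reaches the generic path that rejects its layout, which is exactly why the repro above stops throwing.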
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51758
Reviewed By: pbelevich
Differential Revision: D26655442
Pulled By: bdhirsh
fbshipit-source-id: fcc5e74734cae91f725fab525f181b3066eafa28