optimize softmax backward and logsoftmax backward (#80114)
Currently, if softmax_backward/logsoftmax_backward is run along a dim other than the last, the calculation falls back to a [scalar version](https://github.com/pytorch/pytorch/blob/32593ef2dd26e32ed44d3c03d3f5de4a42eb149a/aten/src/ATen/native/SoftMax.cpp#L220-L287). We found that the calculation can in fact be vectorized along the inner_size dim.
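As a minimal illustration of the case in question (shapes are arbitrary):

```python
import torch

# Softmax over dim=1 of a 3-D tensor: the reduction is not along the last
# dim, so inner_size (128 here) > 1 and the backward pass previously took
# the scalar host_softmax_backward path.
x = torch.randn(32, 64, 128, requires_grad=True)
y = torch.softmax(x, dim=1)
y.sum().backward()  # dispatches softmax backward along a non-last dim
```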
Changes we made:
Use the vectorized softmax_backward_kernel/log_softmax_backward_kernel instead of host_softmax_backward when the reduction is not along the last dim; a sketch of the underlying math follows below.
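For reference, the backward formulas being vectorized can be sketched functionally in Python (a sketch of the math only, not the ATen kernel; shapes are illustrative):

```python
import torch

def softmax_backward(grad_output, output, dim):
    # dX = Y * (dY - sum(dY * Y, dim)); the sum broadcasts over inner_size,
    # so the loop over inner_size vectorizes naturally.
    return output * (grad_output - (grad_output * output).sum(dim, keepdim=True))

def log_softmax_backward(grad_output, output, dim):
    # dX = dY - exp(Y) * sum(dY, dim)
    return grad_output - output.exp() * grad_output.sum(dim, keepdim=True)

# Sanity check against autograd along a non-last dim.
x = torch.randn(4, 8, 16, dtype=torch.double, requires_grad=True)
y = torch.softmax(x, dim=1)
g = torch.randn_like(y)
y.backward(g)
assert torch.allclose(x.grad, softmax_backward(g, y.detach(), dim=1))
```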
We collected benchmark data for softmax_backward and logsoftmax_backward with the BFloat16 and Float32 data types using PyTorch's operator_benchmark tool on an Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz.
Number of cores: 24 cores (1 socket)
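The logs below come from the operator_benchmark suite; a simpler standalone timing sketch using torch.utils.benchmark (not the exact configs behind the numbers in the logs) would look like:

```python
import torch
from torch.utils import benchmark

# Time softmax backward along a non-last dim; shape and dtype are illustrative.
x = torch.randn(32, 64, 128, dtype=torch.bfloat16, requires_grad=True)
y = torch.softmax(x, dim=1)
grad = torch.randn_like(y)

t = benchmark.Timer(
    stmt="y.backward(grad, retain_graph=True)",
    globals={"y": y, "grad": grad},
    num_threads=24,  # matches the 24-core single-socket setup above
)
print(t.blocked_autorange())
```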
[softmax_benchmark_32593ef.log](https://github.com/pytorch/pytorch/files/8962956/softmax_benchmark_32593ef.log)
[softmax_benchmark_the_pr.log](https://github.com/pytorch/pytorch/files/8962958/softmax_benchmark_the_pr.log)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80114
Approved by: https://github.com/frank-wei