pytorch
68d690ff - Vectorize the softmax calculation when not along the last dim (#59195)

Committed 3 years ago
Vectorize the softmax calculation when not along the last dim (#59195)

Summary:
Currently, a softmax that is not along the last dim falls back to a [scalar version](https://github.com/pytorch/pytorch/blob/d417a094f398f1c4efd7f818b14b8471a597fbcc/aten/src/ATen/native/SoftMax.cpp#L14-L64). We found that we can instead vectorize the calculation along the inner_size dim.

Changes:
- Use the vectorized softmax_kernel instead of host_softmax when the softmax is not along the last dim.

Performance on a 28-core Intel 8280 CPU, with input size [32, 81, 15130] and softmax along the second dim (81):
- FP32 baseline: 24.67 ms
- FP32 optimized: 9.2 ms

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59195
Reviewed By: ailzhang
Differential Revision: D28854796
Pulled By: cpuhrsch
fbshipit-source-id: 18477acc3963754c59009b1794f080496ae16c3d
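To illustrate the memory-access pattern the commit exploits: viewing the tensor as [outer_size, dim_size, inner_size], a softmax along the middle dim reduces with stride inner_size, while the inner_size elements themselves are contiguous, so they can be processed as SIMD lanes. The sketch below is a hypothetical pure-Python reference (not the actual ATen C++ kernel); the function name and flat-buffer layout are assumptions for illustration. The innermost `i` loop is exactly what the vectorized kernel would turn into vector operations.

```python
import math

def softmax_mid_dim(x, outer_size, dim_size, inner_size):
    """Softmax over the middle dim of a flat buffer viewed as
    [outer_size, dim_size, inner_size].

    For each (outer, inner) pair the reduction walks dim_size
    elements spaced inner_size apart.  Since consecutive values
    of `i` touch contiguous memory, the `i` loop is the natural
    candidate for vectorization (the scalar fallback instead
    handles each (outer, inner) pair independently).
    """
    out = [0.0] * len(x)
    for o in range(outer_size):
        base = o * dim_size * inner_size
        for i in range(inner_size):          # vectorizable lanes
            idx = [base + d * inner_size + i for d in range(dim_size)]
            m = max(x[j] for j in idx)       # max for numerical stability
            exps = [math.exp(x[j] - m) for j in idx]
            s = sum(exps)
            for j, e in zip(idx, exps):
                out[j] = e / s
    return out
```

For example, a buffer of four equal values viewed as [1, 2, 2] yields 0.5 everywhere, since each column of two equal entries softmaxes to uniform weights.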