openvino
ba4782cd - [PT FE] Support scalar (rank-0) input for aten::log_softmax and aten::softmax (#34065)

42 days ago
### Details:
PyTorch allows `log_softmax` and `softmax` on scalar (rank-0) tensors by internally reshaping them to 1D. The WeNet decoder model hits this path: `torch.tensor(0.0)` is passed through `log_softmax` with `dim=-1`, which fails during conversion because OpenVINO core ops reject `axis=-1` for rank-0 inputs.

This patch adds scalar handling at multiple levels:
- PT frontend translators: unsqueeze the scalar to rank 1, apply the op with `axis=0`, then squeeze back. This mirrors what PyTorch does internally.
- Core op validation (LogSoftmax v5, Softmax v1/v8): skip the axis bounds check for rank-0 inputs, since the result is mathematically determined (softmax = 1, log_softmax = 0).
- Softmax v8 `evaluate`: treat a rank-0 input as a 1-element tensor.
- Decomposition passes (LogSoftmax, Softmax): replace a rank-0 input with the appropriate constant (0 or 1).
- Softmax v8-to-v1 downgrade: replace rank-0 with the constant 1 instead of attempting axis normalization.

The core op and transformation changes are necessary even though the PT frontend translator already reshapes scalars to 1D. OpenVINO's validation and transformation pipeline processes operations at multiple stages, and rank-0 LogSoftmax/Softmax nodes can appear in the graph from paths other than the PT frontend — for example, from the ONNX frontend, from direct graph construction via the C++ API, or from subgraph extraction during testing. Additionally, the reference implementations (`evaluate` methods) are used by the constant folding pass and by the test infrastructure to compute expected outputs; if a rank-0 LogSoftmax/Softmax node reaches constant folding or the interpreter backend without these fixes, it hits the same axis validation failure. The decomposition passes need the same treatment because they run before plugin-specific lowering and would otherwise crash on the axis normalization step when encountering a rank-0 input.
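The translator-level workaround can be mirrored in plain Python (a sketch, not the actual OpenVINO C++ translator code): a rank-0 value is "unsqueezed" to a 1-element vector, softmax is applied along axis 0, and the result is "squeezed" back to a scalar.

```python
import math

def softmax_1d(xs):
    """Numerically stable softmax over a 1-D list."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def softmax_scalar(x):
    # unsqueeze: scalar -> [x]; apply along axis 0; squeeze: [y] -> y
    return softmax_1d([x])[0]

def log_softmax_scalar(x):
    return math.log(softmax_scalar(x))

print(softmax_scalar(0.0))      # 1.0
print(log_softmax_scalar(0.0))  # 0.0
```

This is the same unsqueeze → op with `axis=0` → squeeze sequence the PT frontend translators emit, and it reproduces PyTorch's own behavior for `torch.log_softmax(torch.tensor(0.0), dim=-1)`.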
In short, every layer that touches the axis attribute of these ops needs to handle the rank-0 case gracefully, to avoid breaking whichever path happens to encounter it first.

### Tickets:
- 143262
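A quick check of why the decomposition passes may substitute constants (this is an illustrative sketch, not OpenVINO pass code): for a single logit x, softmax(x) = exp(x)/exp(x) = 1 regardless of x, so log_softmax(x) = log(1) = 0. The replacement constants are therefore value-independent.

```python
import math

def softmax_rank0(x: float) -> float:
    # Softmax over one element: the value cancels itself out.
    return math.exp(x) / math.exp(x)  # always 1.0 for finite x

# The result does not depend on the input value, which is what
# justifies replacing the rank-0 node with a constant in the graph.
for x in (-3.5, 0.0, 7.25):
    assert softmax_rank0(x) == 1.0            # Softmax -> constant 1
    assert math.log(softmax_rank0(x)) == 0.0  # LogSoftmax -> constant 0
```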