onnxruntime
756a5235 - Fix sigmoid transformation in TreeEnsembleClassifier for all-positive weights with LOGISTIC post_transform (#27536)

### Description

`TreeEnsembleClassifier` with `post_transform=LOGISTIC` was not applying the sigmoid transformation when all tree leaf weights are non-negative. This manifests for binary classifiers where every tree is a single leaf node (no splits), a valid degenerate case produced by XGBoost when the training data is too small to learn splits. The following fixes were made:

- **`tree_ensemble_aggregator.h` — `_set_score_binary()`**: The `weights_are_all_positive_` field and its associated fast path (cases 0/1, threshold at 0.5) have been removed entirely from the classifier. The classifier now always uses the logit-threshold path (cases 2/3, threshold at 0), which correctly applies sigmoid for `LOGISTIC` post-transform regardless of whether leaf weights are non-negative.
- **`ml_common.h` — `write_scores()`**: For cases 0/1, apply `ComputeLogistic` (sigmoid) when `post_transform == LOGISTIC` instead of the raw `[1 - score, score]` output. This is a defense-in-depth fix for other callers such as SVMClassifier.
- **`ml_common.h` — `batched_update_scores_inplace()`**: The same fix for cases 0/1 in the batched code path used by SVMClassifier.
- **Regression test**: Added `TreeEnsembleClassifierBinaryLogisticAllPositiveWeights` in `tree_ensembler_classifier_test.cc`, covering a single-leaf tree (all-positive weights) with `post_transform=LOGISTIC` for both positive and negative aggregate score cases.

### Motivation and Context

When converting XGBoost binary:logistic models to ONNX, trees with no splits (leaf-only) produce only non-negative leaf weights, setting `weights_are_all_positive_ = true`. In this state, `_set_score_binary` assigned `write_additional_scores` a value of 0 or 1, causing `write_scores` to output `[1 - score, score]` without sigmoid, which is incorrect for a LOGISTIC post-transform.
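As a sanity check on the logit-threshold path mentioned above: because the sigmoid is strictly increasing with `sigmoid(0) == 0.5`, thresholding the probability at 0.5 and thresholding the raw aggregate logit at 0 yield identical class decisions. A minimal Python illustration (not the C++ code):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# sigmoid is monotonic and sigmoid(0) == 0.5, so comparing the
# probability to 0.5 decides the same class as comparing the
# raw aggregate logit to 0.
for logit in (-3.0, -0.405, 0.0, 0.405, 3.0):
    assert (sigmoid(logit) > 0.5) == (logit > 0.0)
```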
Trees with real splits (mixed positive/negative weights) set `weights_are_all_positive_ = false`, correctly routing through the sigmoid path. This caused major score mismatches when upgrading from XGBoost 1.7.2 to XGBoost 3 with small training datasets.

The root cause has been fully addressed by removing `weights_are_all_positive_` from the tree ensemble classifier code path entirely. The `ml_common.h` changes remain as defense-in-depth fixes for other callers (e.g. SVMClassifier) that may still set `add_second_class` to 0 or 1.

<details>
<summary>Original prompt</summary>

----

*This section details the original issue to resolve.*

<issue_title>TreeEnsembleClassifier with post_transform=LOGISTIC skips sigmoid for leaf-only trees when all weights are non-negative</issue_title>

<issue_description>
## Describe the issue

We have code which converts XGBoost models to ONNX. We were seeing major score mismatches when attempting to upgrade from XGBoost 1.7.2 to XGBoost 3. With the help of AI, I cloned down the source code for the various open-source repos involved. The AI believes the issue lives in onnxruntime itself and was the result of our training dataset being too small to produce splits. To work around this, we just increased the size of our test dataset, but I figured it was worth opening a bug report about this edge case. The AI-generated bug report is as follows:

`TreeEnsembleClassifier` with `post_transform=LOGISTIC` does not apply the sigmoid transformation when **all tree weights are non-negative** (`weights_are_all_positive_ = true`). This happens for binary classifiers where every tree is a single leaf node (no splits), which is a valid degenerate case produced by XGBoost when training data is too small to learn splits.
**Expected behavior**: The second output (class scores) should contain post-transformed probabilities: `[1 - sigmoid(agg), sigmoid(agg)]`

**Actual behavior**: The second output contains raw scores without sigmoid: `[1 - agg, agg]`

The bug is in the interaction between `_set_score_binary` in `tree_ensemble_aggregator.h` and `write_scores` in `ml_common.h`. When `weights_are_all_positive_ == true`, `_set_score_binary` sets `add_second_class` to 0 or 1. The `write_scores` function only applies LOGISTIC (sigmoid) for `add_second_class` values 2 and 3; values 0 and 1 output `[1 - score, score]` without sigmoid.

Trees with real splits (which produce both positive and negative weights) correctly set `weights_are_all_positive_ = false`, causing `add_second_class` to be 2 or 3, and the sigmoid IS applied. So the bug only manifests for leaf-only trees.

### Source code references

1. **`tree_ensemble_aggregator.h` — `_set_score_binary()`**: When `weights_are_all_positive_ == true`, overwrites `write_additional_scores` to 0 or 1:

   ```cpp
   if (weights_are_all_positive_) {
     if (pos_weight > 0.5) {
       write_additional_scores = 0;  // <-- bypasses sigmoid
       return class_labels_[1];
     } else {
       write_additional_scores = 1;  // <-- bypasses sigmoid
       return class_labels_[0];
     }
   }
   ```

2. **`ml_common.h` — `write_scores()`**: Only applies LOGISTIC for `add_second_class` 2 and 3:

   ```cpp
   switch (add_second_class) {
     case 0:
     case 1:
       // Raw score output — NO sigmoid applied
       scores.push_back(scores[0]);
       scores[0] = 1 - scores[0];
       break;
     case 2:
     case 3:
       if (post_transform == POST_EVAL_TRANSFORM::LOGISTIC) {
         // Sigmoid IS applied here
         scores[1] = ComputeLogistic(scores[0]);
         scores[0] = ComputeLogistic(-scores[0]);
       }
       break;
   }
   ```

3. **`tree_ensemble_common.h`**: `weights_are_all_positive_` is set to `true` when all `class_weights` values are non-negative. Leaf-only XGBoost binary:logistic trees always have non-negative weights.
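The two code paths above can be contrasted numerically. This is a simplified Python model of the branch behavior, not the C++ source; the example logit value is illustrative:

```python
import math

def compute_logistic(x: float) -> float:
    """1 / (1 + e^-x), mirroring ComputeLogistic in ml_common.h."""
    return 1.0 / (1.0 + math.exp(-x))

def write_scores_binary(score: float, add_second_class: int, logistic: bool):
    # Simplified model of the switch shown above (not the C++ code).
    if add_second_class in (0, 1):
        # Path taken for leaf-only trees before the fix: raw scores.
        return [1.0 - score, score]
    # Cases 2/3: sigmoid is applied for LOGISTIC.
    if logistic:
        return [compute_logistic(-score), compute_logistic(score)]
    return [-score, score]

agg = -0.405  # aggregate logit of a leaf-only tree, i.e. logit(0.4)
buggy = write_scores_binary(agg, add_second_class=1, logistic=True)
fixed = write_scores_binary(agg, add_second_class=3, logistic=True)
print(buggy)  # [1.405, -0.405] -- not probabilities at all
print(fixed)  # roughly [0.6, 0.4] -- [1 - sigmoid(agg), sigmoid(agg)]
```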
## To reproduce

Minimal self-contained reproducer:

```python
import numpy as np
import onnxruntime
from onnx import TensorProto, helper

print(f"onnxruntime version: {onnxruntime.__version__}")

X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [None, 3])
label_out = helper.make_tensor_value_info("label", TensorProto.INT64, [None])
prob_out = helper.make_tensor_value_info("probs", TensorProto.FLOAT, [None, 2])

def make_model(nodes_modes, nodes_values, nodes_truenodeids, nodes_falsenodeids,
               class_treeids, class_nodeids, class_weights, **node_kwargs):
    """Build a minimal TreeEnsembleClassifier ONNX model."""
    n_nodes = len(nodes_modes)
    node = helper.make_node(
        "TreeEnsembleClassifier",
        inputs=["X"],
        outputs=["label", "probs"],
        domain="ai.onnx.ml",
        nodes_treeids=[0] * n_nodes,
        nodes_nodeids=list(range(n_nodes)),
        nodes_featureids=[0] * n_nodes,
        nodes_values=nodes_values,
        nodes_modes=nodes_modes,
        nodes_truenodeids=nodes_truenodeids,
        nodes_falsenodeids=nodes_falsenodeids,
        nodes_missing_value_tracks_true=[0] * n_nodes,
        nodes_hitrates=[1.0] * n_nodes,
        class_treeids=class_treeids,
        class_nodeids=class_nodeids,
        class_ids=[0] * len(class_weights),
        class_weights=class_weights,
        classlabels_int64s=[0, 1],
        base_values=[-0.405],  # logit(0.4)
        post_transform="LOGISTIC",
        **node_kwargs,
    )
    graph = helper.make_graph([node], "test", [X], [label_out, prob_out])
    return helper.make_model(graph, opset_imports=[
        helper.make_opsetid("", ...
```

</details>

- Fixes microsoft/onnxruntime#27533
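For a leaf-only tree the reference output is straightforward to compute by hand, since no split is ever evaluated. A hedged sketch of the probabilities the fixed runtime should return for the reproducer above (the leaf weight below is hypothetical, chosen only for illustration):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# With no splits, every input row gets the same aggregate logit:
# base_values[0] plus the leaf weight of each (single-leaf) tree.
base_value = -0.405   # logit(0.4), matching the reproducer above
leaf_weight = 0.0     # hypothetical leaf weight, illustration only
agg = base_value + leaf_weight

# Reference "probs" output after the fix: sigmoid applied to the logit.
expected_probs = [1.0 - sigmoid(agg), sigmoid(agg)]
print(expected_probs)  # roughly [0.6, 0.4]
```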
---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: xadupre <22452781+xadupre@users.noreply.github.com>
Co-authored-by: Xavier Dupré <xadupre@microsoft.com>
Co-authored-by: Xavier Dupré <xadupre@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>