Histogram Binning Calibration

Commit

4 years ago

Summary:
Adding a calibration module called histogram binning:

Divide the prediction range (e.g., [0, 1]) into B bins. For each bin, use two parameters to store the number of positive examples and the total number of examples that fall into that bin. Together these form a histogram over the model's predictions.

As a result, each bin has an empirical estimate of the real CTR (num_pos / num_example). We use this estimate as the final calibrated prediction whenever the pre-calibration prediction falls into the corresponding bin.

In this way, the predictions within each bin should be well calibrated if we have sufficient examples. That is, this calibration module gives us a fine-grained calibrated model.

Theoretically, this calibration layer can fix any uncalibrated model or prediction if we have sufficient bins and examples. It makes it possible to apply any kind of training weight allocation to our training data without worrying about the calibration issue.

Test Plan:
buck test dper3/dper3/modules/calibration/tests:calibration_test -- test_histogram_binning_calibration
buck test dper3/dper3_models/ads_ranking/tests:model_paradigm_e2e_tests -- test_sparse_nn_histogram_binning_calibration

All tests passed.

Example workflows:
f215431958 {F326445092}
f215445048 {F326445223}

Reviewed By: chenshouyuan

Differential Revision: D23356450

fbshipit-source-id: c691b66c51ef33908c17575ce12e5bee5fb325ff
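The binning scheme described in the summary can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the module added by this commit: the class name `HistogramBinningCalibrator`, its method names, the equal-width bins, and the fallback to the raw prediction for empty bins are all hypothetical choices made here for clarity.

```python
from typing import List, Sequence


class HistogramBinningCalibrator:
    """Illustrative histogram binning calibrator (hypothetical, not the commit's module)."""

    def __init__(self, num_bins: int, lower: float = 0.0, upper: float = 1.0) -> None:
        if num_bins <= 0 or upper <= lower:
            raise ValueError("need num_bins > 0 and upper > lower")
        self.num_bins = num_bins
        self.lower = lower
        # Assumption: bins are equal-width over [lower, upper].
        self.width = (upper - lower) / num_bins
        # Two parameters per bin: positive count and total example count.
        self.num_pos: List[float] = [0.0] * num_bins
        self.num_examples: List[float] = [0.0] * num_bins

    def _bin_index(self, prediction: float) -> int:
        # Map a prediction to its bin, clamping out-of-range values to the edge bins.
        idx = int((prediction - self.lower) / self.width)
        return min(max(idx, 0), self.num_bins - 1)

    def update(self, predictions: Sequence[float], labels: Sequence[int]) -> None:
        # Accumulate the histogram from (uncalibrated prediction, 0/1 label) pairs.
        for prediction, label in zip(predictions, labels):
            idx = self._bin_index(prediction)
            self.num_examples[idx] += 1.0
            self.num_pos[idx] += float(label)

    def calibrate(self, prediction: float) -> float:
        # Return the bin's empirical positive rate (num_pos / num_example).
        idx = self._bin_index(prediction)
        if self.num_examples[idx] == 0.0:
            # Assumption: with no data in the bin, keep the raw prediction.
            return prediction
        return self.num_pos[idx] / self.num_examples[idx]


calibrator = HistogramBinningCalibrator(num_bins=4)
calibrator.update([0.1, 0.2, 0.15, 0.9], [0, 1, 0, 1])
low_bin_estimate = calibrator.calibrate(0.05)    # first bin: 1 positive out of 3 examples
empty_bin_estimate = calibrator.calibrate(0.6)   # third bin has no examples yet
```

With only four examples the per-bin estimates are very noisy, which is why the summary stresses having sufficient examples in each bin. How a real implementation treats sparsely populated bins is a design decision that this sketch does not attempt to reproduce.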

Author

Yangxin Zhong

Committer

facebook-github-bot

Parents

ac1f471f

pytorch 514f20ea - Histogram Binning Calibration
