Support focal loss in MTML
Summary:
[Not in need of review at this time]
Support focal loss in MTML (and, effectively, in dper2 in general), as described in https://arxiv.org/pdf/1708.02002.pdf. The approach is similar to the one in Yuchen He's WIP diff D14008545.
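For reference, the binary focal loss from the paper is FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), i.e. the per-example logistic (LR) loss scaled by a focal factor that down-weights well-classified examples. The NumPy sketch below only illustrates that formula; it is not the dper layer code from this diff, and the function name `focal_loss` and its parameters are hypothetical. It computes the forward pass only, so it does not model the stop-gradient variant exercised by the third unit test.

```python
import numpy as np


def focal_loss(logits, labels, gamma=2.0, alpha=None):
    """Per-example binary focal loss (illustrative sketch, not the dper code).

    logits: raw model outputs, labels: 0/1 targets,
    gamma: focusing parameter, alpha: optional class-balancing weight.
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)

    # Numerically stable logistic loss, equal to -log(p_t).
    ce = (np.maximum(logits, 0.0) - logits * labels
          + np.log1p(np.exp(-np.abs(logits))))

    # p_t is the probability the model assigns to the true class.
    p_t = np.exp(-ce)

    # Focal factor: close to 0 for easy examples, close to 1 for hard ones.
    focal_factor = (1.0 - p_t) ** gamma
    loss = focal_factor * ce

    if alpha is not None:
        # alpha weights positives, (1 - alpha) weights negatives.
        alpha_t = alpha * labels + (1.0 - alpha) * (1.0 - labels)
        loss = alpha_t * loss
    return loss
```

With gamma=0 and alpha=None this reduces to the plain LR loss, which makes a convenient sanity check against an existing baseline.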
Test Plan:
Passed the following unit tests:
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_lr_loss_based_focal_loss
buck test //caffe2/caffe2/fb/dper/layer_models/tests:mtml_test_2 -- test_mtml_with_lr_loss_based_focal_loss
buck test //caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_lr_loss_based_focal_loss_with_stop_grad_in_focal_factor
Passed ./fblearner/flow/projects/dper/canary.sh; URL to track workflow runs: https://fburl.com/fblearner/446ix5q6
Model based on V10 of this diff:
f133367092
Baseline model:
f133297603
Protobuf of train_net_1: https://our.intern.facebook.com/intern/everpaste/?color=0&handle=GEq30QIFW_7HJJoCAAAAAABMgz4Jbr0LAAAz
Reviewed By: hychyc90, ellie-wen
Differential Revision: D16795972
fbshipit-source-id: 7bacae3e2255293d337951c896e9104208235f33