Implementation of cosine learning rate training policy (#29017)
Summary:
Implements the cosine learning rate schedule with warm restarts (SGDR) described in https://arxiv.org/pdf/1608.03983.pdf.
Mostly inspired by:
https://github.com/pytorch/fairseq/blob/master/fairseq/optim/lr_scheduler/cosine_lr_scheduler.py
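For reference, a minimal Python sketch of the schedule, not the Caffe2 operator added in this PR. The helper name cosine_lr, the min_lr argument, and the one-step floor on the period are assumptions made for illustration. The parameter names max_lr, initial_period, t_mult and lr_shrink follow the test plan below.

```python
import math


def cosine_lr(step, max_lr, min_lr=0.0, initial_period=20, t_mult=1.0, lr_shrink=1.0):
    """Hypothetical helper: learning rate at `step` for cosine annealing with restarts."""
    period = float(initial_period)
    t_curr = float(step)
    cycle = 0
    # Walk forward through completed cycles to find the one containing `step`.
    while t_curr >= period:
        t_curr -= period
        # Assumption: floor the period at one step so the loop terminates when t_mult < 1.
        period = max(period * t_mult, 1.0)
        cycle += 1
    # Shrink both bounds once per completed cycle.
    scale = lr_shrink ** cycle
    lo, hi = min_lr * scale, max_lr * scale
    # Anneal from hi at the start of the cycle toward lo at its end.
    return lo + 0.5 * (hi - lo) * (1.0 + math.cos(math.pi * t_curr / period))
```

Under these assumptions, cosine_lr(0, 0.3) returns max_lr, the rate decays toward min_lr over the first 20 steps, and the restart at step 20 with lr_shrink=0.95 begins from 0.3 * 0.95. The operator in this PR may handle period lengths and edge cases differently.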
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29017
Test Plan:
buck test -v 2 caffe2/caffe2/fb/dper/layer_models/tests/split_1:sparse_nn_test -- test_composite_cosine_lr_policy
learning rate log with max_lr=0.3, initial_period=20, t_mult=0.95, lr_shrink=0.95: P120327179
https://pxl.cl/PrcP
full canary: https://fburl.com/fblearner/mw69ylsd
Differential Revision: D18195868
Pulled By: grantlj
fbshipit-source-id: 67bdb0b8dd31d040d16b29d0da3115907bd141ef