[caffe2] L2 regularization for fused sparse Adagrad (#37652)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37652
Add a weight_decay argument to the fused sparse Adagrad operators. This diff should land together with the next one; it is split out only to make review easier.
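
For reviewers, a minimal sketch of what L2 regularization typically means for a sparse Adagrad step. This is illustrative only and is not the operator code in this diff. It assumes weight decay is folded into the gradient as grad + weight_decay * param before the moment update, and that a positive learning rate is subtracted. The function name sparse_adagrad_step and its signature are hypothetical.

```python
import math


def sparse_adagrad_step(param, moment, indices, grads, lr,
                        epsilon=1e-5, weight_decay=0.0):
    """Hypothetical reference: update only the rows listed in `indices`, in place.

    param, moment: lists of rows (lists of floats) with the same shape.
    grads: one gradient row per entry in `indices`.
    """
    for row, grad_row in zip(indices, grads):
        for j, g in enumerate(grad_row):
            # Assumed L2 form: add weight_decay * w to the raw gradient.
            g = g + weight_decay * param[row][j]
            # Adagrad accumulates the squared (regularized) gradient.
            moment[row][j] += g * g
            # Per-element step scaled by the accumulated moment.
            param[row][j] -= lr * g / (math.sqrt(moment[row][j]) + epsilon)
    return param, moment
```

With weight_decay=0.0 the sketch reduces to plain sparse Adagrad, and rows not listed in indices are never touched, so under this assumption the decay only applies to rows that receive a gradient.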
Test Plan: CI
Reviewed By: jspark1105
Differential Revision: D21320243
fbshipit-source-id: 1157471988dedd60ba9b62949055f651b1fa028f