[caffe2] Extend dedup SparseAdagrad fusion with stochastic rounding FP16 (#43124)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43124
Add stochastic rounding FP16 support to the dedup version of the SparseAdagrad fusion.
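
For reviewers unfamiliar with the term, here is a minimal NumPy sketch of stochastic rounding from FP32 to FP16 in general. It is not the fused caffe2 kernel from this diff: the function name `stochastic_round_to_fp16` is hypothetical, and the bit trick assumes finite inputs whose magnitude falls in the normal FP16 range.

```python
import numpy as np


def stochastic_round_to_fp16(x, rng):
    """Illustrative only: round float32 values to float16 stochastically.

    Assumes finite inputs within the normal FP16 range. float32 keeps 23
    mantissa bits and float16 keeps 10, so the low 13 bits are discarded.
    Adding uniform random noise to those 13 bits before truncating makes
    the value round away from zero with probability equal to the
    discarded fraction, so the result is unbiased in expectation.
    """
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    noise = rng.integers(0, 1 << 13, size=bits.shape, dtype=np.uint32)
    # Carry from the noise bumps the kept mantissa; the mask drops the rest.
    rounded = (bits + noise) & np.uint32(0xFFFFE000)
    # The low 13 bits are now zero, so this cast is exact for normal values.
    return rounded.view(np.float32).astype(np.float16)


rng = np.random.default_rng(0)
x = np.full(100000, 1.0 + 2.0 ** -12, dtype=np.float32)
y = stochastic_round_to_fp16(x, rng)
# x sits a quarter of the way between the FP16 neighbors 1.0 and 1.0 + 2**-10,
# so roughly a quarter of the samples should round up.
print(np.mean(y == np.float16(1.0 + 2.0 ** -10)))
```

Round-to-nearest would map every such `x` to 1.0 and silently drop small updates; stochastic rounding preserves them on average, which is the usual motivation for using it with FP16 parameters.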
ghstack-source-id: 111037723
Test Plan:
```
buck test mode/dev-nosan //caffe2/caffe2/fb/net_transforms/tests:fuse_sparse_ops_test -- 'test_fuse_sparse_adagrad_with_sparse_lengths_sum_gradient \(caffe2\.caffe2\.fb\.net_transforms\.tests\.fuse_sparse_ops_test\.TestFuseSparseOps\)' --print-passing-details
```
https://our.intern.facebook.com/intern/testinfra/testrun/5629499566042000
```
buck test mode/dev-nosan //caffe2/caffe2/fb/net_transforms/tests:fuse_sparse_ops_test -- 'test_fuse_sparse_adagrad_with_sparse_lengths_mean_gradient \(caffe2\.caffe2\.fb\.net_transforms\.tests\.fuse_sparse_ops_test\.TestFuseSparseOps\)' --print-passing-details
```
https://our.intern.facebook.com/intern/testinfra/testrun/1125900076333177
Reviewed By: xianjiec
Differential Revision: D22893851
fbshipit-source-id: 81c7a7fe4b0d2de0e6b4fc965c5d23210213c46c