Remove the MomentumSGDUpdate maximum block count limit and template the nesterov parameter
Summary: Remove the limit on the maximum number of blocks from the operator, and make the nesterov parameter a template parameter so that the update no longer branches on it at runtime.
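
A minimal sketch of the templating idea, written as plain C++ on the CPU rather than as the actual CUDA kernel. The names `MomentumSGDUpdateSketch` and `RunMomentumSGDUpdateSketch` are hypothetical. The update formulas are an assumption based on the usual momentum and Nesterov momentum SGD rules, not copied from this diff. The point is only that the runtime flag is checked once at dispatch, while the per-element `if` is on a compile-time constant.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical CPU stand-in for the kernel body (illustrative only).
// `nesterov` is a template parameter, so the `if` below is resolved at
// compile time and each instantiation contains only one of the two paths.
template <bool nesterov>
void MomentumSGDUpdateSketch(std::size_t n, const float* g, const float* m,
                             float* ng, float* nm, float lr, float momentum,
                             float* param) {
  assert(g && m && ng && nm);
  for (std::size_t i = 0; i < n; ++i) {
    if (nesterov) {
      // Assumed Nesterov form: look ahead using the updated momentum.
      const float mi = m[i];
      const float mi_new = momentum * mi + lr * g[i];
      nm[i] = mi_new;
      ng[i] = (1.0f + momentum) * mi_new - momentum * mi;
    } else {
      // Assumed classic momentum form.
      const float adjusted = lr * g[i] + momentum * m[i];
      nm[i] = adjusted;
      ng[i] = adjusted;
    }
    if (param) {
      param[i] -= ng[i];  // optional in-place parameter update
    }
  }
}

// The runtime flag is branched on exactly once, outside the element loop.
inline void RunMomentumSGDUpdateSketch(bool nesterov, std::size_t n,
                                       const float* g, const float* m,
                                       float* ng, float* nm, float lr,
                                       float momentum, float* param) {
  if (nesterov) {
    MomentumSGDUpdateSketch<true>(n, g, m, ng, nm, lr, momentum, param);
  } else {
    MomentumSGDUpdateSketch<false>(n, g, m, ng, nm, lr, momentum, param);
  }
}
```

In a real kernel, the same dispatch would presumably select between two kernel instantiations at launch time, so no thread evaluates the flag per element.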
Reviewed By: BIT-silence
Differential Revision: D14567003
fbshipit-source-id: 394c2039ee214adc6ccd2e562e4e9563d307131f