[sr] remove max_indices argument of embedding_bag when unnecessary (#75993)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75993
Strobelight shows copy_ in embedding_bag taking up a lot of time in adfinder_story_post_ad_session_exit_model 334827604_0
{F723683014}
More details in https://fb.quip.com/MKumAjz1YD4a#temp:C:FPD3e5a0871ae5d481286b511ef7
The last 3 outputs of embedding_bag are unused in the graph: P495814049.
* max_indices output isn't necessary for the main output, so remove it when it's not used in the graph.
* offset2bag is used as an intermediate to calculate the main output, so we don't remove this output even though it's unused in the graph.
* bag_size is used as an intermediate to calculate the main output for MODE_MEAN, so we don't remove this for now.
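The dependency between the outputs described above can be sketched in plain Python. This is an illustrative model only, not the PyTorch kernel: the function name `embedding_bag_sketch` and the `need_max_indices` flag are hypothetical, and what the real kernel stores in max_indices is not modeled here. It only shows that offset2bag and bag_size feed the main output, while max_indices can be skipped without changing it.

```python
# Illustrative pure-Python model of the output dependencies; NOT the PyTorch kernel.
# `embedding_bag_sketch` and `need_max_indices` are hypothetical names.
def embedding_bag_sketch(weight, indices, offsets, mode="sum", need_max_indices=True):
    num_bags = len(offsets)
    dim = len(weight[0])

    # offset2bag[i] is the bag that indices[i] belongs to (intermediate for the sum).
    offset2bag = []
    for bag, start in enumerate(offsets):
        end = offsets[bag + 1] if bag + 1 < num_bags else len(indices)
        offset2bag.extend([bag] * (end - start))

    # bag_size[b] is the number of indices in bag b (intermediate for mode="mean").
    bag_size = [0] * num_bags
    for bag in offset2bag:
        bag_size[bag] += 1

    # Main output: per-bag sum of the selected embedding rows.
    output = [[0.0] * dim for _ in range(num_bags)]
    for pos, row in enumerate(indices):
        bag = offset2bag[pos]
        for d in range(dim):
            output[bag][d] += weight[row][d]

    if mode == "mean":
        for bag in range(num_bags):
            if bag_size[bag] > 0:
                output[bag] = [v / bag_size[bag] for v in output[bag]]

    # max_indices does not feed the sum/mean output, so it can be skipped
    # when nothing downstream reads it. The placeholder contents are an assumption.
    max_indices = [0] * num_bags if need_max_indices else None
    return output, offset2bag, bag_size, max_indices


weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
with_max = embedding_bag_sketch(weight, [0, 2, 1], [0, 2], mode="mean")
without_max = embedding_bag_sketch(weight, [0, 2, 1], [0, 2], mode="mean",
                                   need_max_indices=False)
print(with_max[0] == without_max[0])  # main output is unaffected by the flag
```

In this sketch, dropping offset2bag or bag_size would break the main output, which mirrors why only max_indices is removed in this change.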
Test Plan:
`./caffe2/caffe2/fb/predictor/scripts/run_disagg_model_benchmarks.sh 334827604 0 /data/users/ansha/tmp/ads_tail sr_only`
Inputs uploaded to `/mnt/persistent-public/ansha/ads_tail/334827604`
Before:
I0414 10:53:12.261133 1070948 PyTorchPredictorBenchLib.cpp:305] PyTorch run finished. Milliseconds per iter: 0.121318. Iters per second: 8242.78
0.11156 ms. 99.0457%. aten::embedding_bag (52 nodes, out variant)
After:
I0418 13:05:10.837378 2354604 PyTorchPredictorBenchLib.cpp:305] PyTorch run finished. Milliseconds per iter: 0.0881273. Iters per second: 11347.2
0.0789221 ms. 98.7096%. static_runtime::embedding_bag (52 nodes, out variant)
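For a quick read of the Before/After lines above, the improvement follows directly from the reported per-iteration latencies. A minimal arithmetic sketch (variable names are illustrative); it works out to about 1.38x, or roughly a 27% drop in latency:

```python
# Per-iteration latencies copied from the benchmark logs above (milliseconds).
before_ms = 0.121318
after_ms = 0.0881273

speedup = before_ms / after_ms            # how many times faster one iteration is
reduction = 1.0 - after_ms / before_ms    # fraction of per-iteration latency removed

print(f"speedup: {speedup:.2f}x, latency reduction: {reduction:.1%}")
```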
* Ads prod canary:
https://www.internalfb.com/intern/ads/canary/443002539593035806/
* 4M test: `servicelab create cogwheel_pyper_inference_fullsync_ads_inline_cvr_post_imp -a D35726594`
https://www.internalfb.com/intern/servicelab/602875732/
* 4M test: `servicelab create cogwheel_pyper_inference_fullsync_ads_10x_ctr_mbl_feed_non_mimo -a D35726594`
https://www.internalfb.com/intern/servicelab/1002874745/
Reviewed By: mikeiovine
Differential Revision: D35726594
fbshipit-source-id: 3b71a0822657bf7a23ce37ca899baef9997b011a
(cherry picked from commit fd5e3098c047a1e7d4348e1c97341eecb892536e)