OpInfo: index_fill (port remaining method_tests) (#57009)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53237
Before PR (about 97s total; the 20 slowest tests are listed in the details)
<details>
```
pytest test/test_ops.py -k _index_fill --durations=20
========================================================================= test session starts ==========================================================================
platform linux -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
plugins: hypothesis-5.38.1
collected 19327 items / 19225 deselected / 102 selected
test/test_ops.py s..................ssssssssssssssssssss..................ss....ssssssssssssssss....sssss....ssssss.... [100%]
=========================================================================== warnings summary ===========================================================================
========================================================================= slowest 20 durations =========================================================================
44.14s call test/test_ops.py::TestGradientsCUDA::test_fn_gradgrad_index_fill_cuda_complex128
13.08s call test/test_ops.py::TestGradientsCPU::test_fn_gradgrad_index_fill_cpu_complex128
7.36s call test/test_ops.py::TestGradientsCUDA::test_fn_grad_index_fill_cuda_complex128
4.20s call test/test_ops.py::TestCommonCUDA::test_variant_consistency_jit_index_fill_cuda_float32
3.42s call test/test_ops.py::TestCommonCPU::test_variant_consistency_jit_index_fill_cpu_float32
2.93s call test/test_ops.py::TestCommonCUDA::test_variant_consistency_jit_index_fill_cuda_complex64
2.32s call test/test_ops.py::TestGradientsCPU::test_fn_grad_index_fill_cpu_complex128
2.18s call test/test_ops.py::TestCommonCPU::test_variant_consistency_jit_index_fill_cpu_complex64
1.03s call test/test_ops.py::TestOpInfoCUDA::test_duplicate_method_tests_index_fill_cuda_float32
0.84s call test/test_ops.py::TestGradientsCUDA::test_fn_grad_index_fill_cuda_float64
0.64s call test/test_ops.py::TestGradientsCUDA::test_fn_gradgrad_index_fill_cuda_float64
0.41s call test/test_ops.py::TestOpInfoCUDA::test_supported_backward_index_fill_cuda_complex128
0.41s call test/test_ops.py::TestOpInfoCUDA::test_supported_backward_index_fill_cuda_bfloat16
0.39s call test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_complex64
0.38s call test/test_ops.py::TestCommonCUDA::test_variant_consistency_eager_index_fill_cuda_float32
0.36s call test/test_ops.py::TestOpInfoCUDA::test_supported_backward_index_fill_cuda_complex64
0.36s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_float16
0.35s call test/test_ops.py::TestOpInfoCUDA::test_supported_backward_index_fill_cuda_float16
0.35s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_int16
0.35s call test/test_ops.py::TestOpInfoCUDA::test_supported_backward_index_fill_cuda_float32
======================================================================= short test summary info ========================================================================
=============================================== 52 passed, 50 skipped, 19225 deselected, 8 warnings in 97.31s (0:01:37) ================================================
```
</details>
After PR (about 93s total; the 20 slowest tests are listed in the details)
<details>
```
pytest test/test_ops.py -k _index_fill --durations=20
========================================================================= test session starts ==========================================================================
platform linux -- Python 3.8.6, pytest-6.1.2, py-1.9.0, pluggy-0.13.1
plugins: hypothesis-5.38.1
collected 19327 items / 19225 deselected / 102 selected
test/test_ops.py s..................ssssssssssssssssssss..................ss....ssssssssssssssss....sssss....ssssss.... [100%]
=========================================================================== warnings summary ===========================================================================
========================================================================= slowest 20 durations =========================================================================
40.88s call test/test_ops.py::TestGradientsCUDA::test_fn_gradgrad_index_fill_cuda_complex128
13.12s call test/test_ops.py::TestGradientsCPU::test_fn_gradgrad_index_fill_cpu_complex128
7.03s call test/test_ops.py::TestGradientsCUDA::test_fn_grad_index_fill_cuda_complex128
3.48s call test/test_ops.py::TestCommonCUDA::test_variant_consistency_jit_index_fill_cuda_complex64
3.01s call test/test_ops.py::TestCommonCUDA::test_variant_consistency_jit_index_fill_cuda_float32
2.55s call test/test_ops.py::TestCommonCPU::test_variant_consistency_jit_index_fill_cpu_complex64
2.43s call test/test_ops.py::TestGradientsCPU::test_fn_grad_index_fill_cpu_complex128
2.38s call test/test_ops.py::TestCommonCPU::test_variant_consistency_jit_index_fill_cpu_float32
1.10s call test/test_ops.py::TestOpInfoCUDA::test_duplicate_method_tests_index_fill_cuda_float32
0.76s call test/test_ops.py::TestGradientsCUDA::test_fn_grad_index_fill_cuda_float64
0.67s call test/test_ops.py::TestGradientsCUDA::test_fn_gradgrad_index_fill_cuda_float64
0.50s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_bfloat16
0.50s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_uint8
0.49s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_float64
0.49s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_float16
0.49s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_complex128
0.49s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_bool
0.49s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_float32
0.49s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_int32
0.48s call test/test_ops.py::TestOpInfoCUDA::test_supported_dtypes_index_fill_cuda_complex64
======================================================================= short test summary info ========================================================================
=============================================== 52 passed, 50 skipped, 19225 deselected, 8 warnings in 93.31s (0:01:33) ================================================
```
</details>
TODO:
* [x] Add test timings (Before and After)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57009
Reviewed By: H-Huang
Differential Revision: D28027095
Pulled By: mruberry
fbshipit-source-id: 6509ff726c8d954171cc0921b803ba261091a0e9