Reimplement torch::flip based on advanced indexing (#56713)
Summary:
## Rationale
This PR improves the performance of `torch::flip` by using `TensorIterator` in the same fashion as `AdvancedIndexing`. The implementation is therefore semantically equivalent to indexing a tensor with reversed indices, `A[dim0_size - 1:0 ..., dimN_size-1:0, ...]`.
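As a minimal sketch of that equivalence seen from the Python side, flipping along `H` and `W` should give the same result as advanced indexing with descending index tensors. The tensor shape and variable names here are illustrative assumptions, not taken from the PR:

```python
import torch

# Hypothetical 4D input laid out as [B, C, H, W]; the sizes are arbitrary.
x = torch.arange(2 * 3 * 4 * 5, dtype=torch.float32).reshape(2, 3, 4, 5)

# Descending indices for the H and W dimensions.
rev_h = torch.arange(x.size(2) - 1, -1, -1)
rev_w = torch.arange(x.size(3) - 1, -1, -1)

# Vertical flip (H) and horizontal flip (W) via torch.flip.
vflip = torch.flip(x, dims=[2])
hflip = torch.flip(x, dims=[3])

# The same results via advanced indexing with reversed indices.
vflip_indexed = x[:, :, rev_h]
hflip_indexed = x[..., rev_w]

assert torch.equal(vflip, vflip_indexed)
assert torch.equal(hflip, hflip_indexed)

# Flipping both dimensions at once matches indexing both with broadcast indices.
both = torch.flip(x, dims=[2, 3])
both_indexed = x[:, :, rev_h[:, None], rev_w[None, :]]
assert torch.equal(both, both_indexed)
```

This only checks the observable behaviour; it says nothing about how the C++ kernel iterates over memory internally.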
## Benchmark results
The following benchmark compares the runtime of this implementation of `flip` against the current implementation, AdvancedIndexing with reversed indices, and the OpenCV implementation. The comparison scenarios use a 4D tensor `[B, C, H, W]`, where the flipped dimensions are `H` (vertical flip) and `W` (horizontal flip), with float32 and uint8 datatypes.
The benchmark implementation details can be found in https://github.com/andfoy/flip-benchmarks/blob/main/5_Stable_implementation/benchmarks.py. Additionally, there are correctness tests against the current flip implementation in https://github.com/andfoy/flip-benchmarks/blob/main/5_Stable_implementation/main.cpp, which cover different layouts, datatypes, and contiguous/non-contiguous tensors.
The following plots show the mean runtime of each operator over 100 samples. As the plots show, the new implementation of flip has a runtime similar to the indexing one, and the performance gains are up to 6X in some scenarios.
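For readers who want a rough local comparison, the measurement can be approximated as below. This is a simplified sketch and not the linked benchmark script: the tensor size, the `mean_runtime` helper, and the use of `timeit` are assumptions, it runs on CPU only, does no warm-up, and leaves out the OpenCV baseline.

```python
import statistics
import timeit

import torch


def mean_runtime(fn, samples=100):
    """Return the mean wall-clock time in seconds of `fn` over `samples` single runs."""
    return statistics.mean(timeit.repeat(fn, number=1, repeat=samples))


# Hypothetical [B, C, H, W] input; the linked benchmark uses its own sizes.
x = torch.rand(4, 3, 128, 128)
rev_w = torch.arange(x.size(3) - 1, -1, -1)

# Horizontal flip via torch.flip versus advanced indexing with reversed indices.
results = {
    "flip": mean_runtime(lambda: torch.flip(x, dims=[3])),
    "indexing": mean_runtime(lambda: x[..., rev_w]),
}

for name, seconds in results.items():
    print(f"{name}: {seconds * 1e6:.1f} us")
```

Absolute numbers will vary with hardware, thread settings, and the PyTorch build, so only the relative ordering is meaningful.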
### Horizontal flip (float)
[Plot: bokeh_plot]
### Horizontal flip (uint8)
[Plot: bokeh_plot(1)]
### Vertical flip (float)
[Plot: bokeh_plot(2)]
### Vertical flip (uint8)
[Plot: bokeh_plot(3)]
cc fmassa vfdev-5
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56713
Reviewed By: datumbox
Differential Revision: D28255088
Pulled By: fmassa
fbshipit-source-id: 5b8684812357c331e83a677b99cf0d78f0821678