[torch] Add backward support for segment reduce (CPU only)
Summary:
This sets up the boilerplate code for backward and adds the CPU implementation.
Next Steps in order:
- Add backward support for CUDA
- Add support for more aggregation types
- Benchmarking (mainly for CUDA), more testing, and documentation
- Add support for multiple dimensions
Test Plan:
Updated the unit test to also check the correctness of backward.
Wait for CI signal.
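For illustration only, a backward correctness check of this kind can be sketched in plain Python. This is not the code or the unit test from this diff: the max reduction, the helper names (`segment_max`, `segment_max_backward`, `numerical_grad`), and the sample values are all assumptions, and every segment is assumed to be non-empty with no tied maxima.

```python
# Hypothetical reference sketch, not the PyTorch implementation.
# Assumes a max reduction over consecutive, non-empty segments of a 1-D list.


def segment_max(data, lengths):
    """Return the max of each consecutive segment of `data`."""
    out, start = [], 0
    for n in lengths:
        out.append(max(data[start:start + n]))
        start += n
    return out


def segment_max_backward(grad_out, data, lengths):
    """Route each segment's upstream gradient to that segment's max element."""
    grad_in, start = [0.0] * len(data), 0
    for seg, n in enumerate(lengths):
        segment = data[start:start + n]
        grad_in[start + segment.index(max(segment))] = grad_out[seg]
        start += n
    return grad_in


def numerical_grad(data, lengths, grad_out, eps=1e-6):
    """Central finite differences of sum(grad_out * segment_max(data))."""
    def loss(values):
        outputs = segment_max(values, lengths)
        return sum(g * o for g, o in zip(grad_out, outputs))

    grads = []
    for i in range(len(data)):
        plus, minus = list(data), list(data)
        plus[i] += eps
        minus[i] -= eps
        grads.append((loss(plus) - loss(minus)) / (2 * eps))
    return grads


# Sample inputs (made up for this sketch): three segments of sizes 2, 3, 1.
data = [1.0, 5.0, 2.0, 4.0, 3.0, 7.0]
lengths = [2, 3, 1]
grad_out = [1.0, 2.0, 3.0]

analytic = segment_max_backward(grad_out, data, lengths)
numeric = numerical_grad(data, lengths, grad_out)
```

Comparing `analytic` against `numeric` within a small tolerance is the same idea as a gradient check: the hand-written backward should agree with a finite-difference estimate of the forward.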
Reviewed By: ngimel
Differential Revision: D27970340
fbshipit-source-id: 3e608c7fe3628b0a761dd8affc6aad8f65a6ef7f