[torch] Add cuda support for segment reduction 'max' (#54175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54175
Building on the previous PR, this PR adds CUDA support for 1D max reduction.
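For reference, the semantics of a 1D segment max can be sketched in plain Python as below. This is only an illustration of what the reduction computes, not the CUDA kernel added here. The function name `segment_max_1d`, the check that `lengths` sums to `len(data)`, and the `-inf` result for empty segments are assumptions made for the sketch, not behavior confirmed by this PR.

```python
from typing import List, Sequence


def segment_max_1d(data: Sequence[float], lengths: Sequence[int]) -> List[float]:
    """Hypothetical reference: max over consecutive segments of `data`.

    Segment i covers the next lengths[i] elements of `data`.
    """
    if any(n < 0 for n in lengths):
        raise ValueError("lengths must be non-negative")
    # Assumption for this sketch: the segments cover `data` exactly.
    if sum(lengths) != len(data):
        raise ValueError("lengths must sum to len(data)")

    out: List[float] = []
    offset = 0  # start index of the current segment
    for n in lengths:
        segment = data[offset:offset + n]
        # Assumption: an empty segment reduces to -inf (the identity for max).
        out.append(max(segment) if n > 0 else float("-inf"))
        offset += n
    return out


result = segment_max_1d([1, 2, 3, 4, 5, 6], [2, 3, 1])
```

Here `data` is split into segments of length 2, 3, and 1, and each segment is reduced independently, so a GPU implementation can process segments in parallel.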
Next steps:
- Add support for other major reduction types (e.g. min, sum) for 1D tensors
- Documentation for the op
- Perf optimizations and a benchmark utility
- Backward support (not high priority)
- Support for multidimensional tensors (on data and lengths) (not high priority)
- Support for 'indices' (not high priority)
Test Plan: Added unit test
Reviewed By: ngimel
Differential Revision: D27121170
fbshipit-source-id: 1c2565f42e2903e6fc089d56983ce8857efbfa3c