CUDA: add expert reduce kernel (#16857)
* CUDA: add expert reduce kernel
* contiguous checks, better formatting, use std::vector instead of array
* use vector empty() instead of size()
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>