Enable deterministic path for index_put with accumulate=False on CPU and CUDA (#57839)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57839
We reuse `index_put_accum_kernel`, rename it to `index_put_deterministic_kernel`, and add a bool `accumulate` parameter in `index_backward_kernel`, so the same kernel serves both the accumulate=True and accumulate=False cases.
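To illustrate why a single kernel with an `accumulate` flag can cover both cases, here is a rough pure-Python sketch of the general idea, not the actual kernel. The helper name `index_put_deterministic` is hypothetical, it handles only 1-D data, and the "last duplicate wins" rule for accumulate=False is an assumption made for illustration. It stable-sorts positions by target index so duplicate indices are processed in a fixed order:

```python
def index_put_deterministic(dest, indices, values, accumulate=False):
    """Hypothetical 1-D sketch: write values into dest in a fixed order."""
    out = list(dest)
    # Stable sort of source positions by target index. Duplicate targets
    # become adjacent and keep their original relative order.
    order = sorted(range(len(indices)), key=lambda pos: indices[pos])
    i = 0
    while i < len(order):
        target = indices[order[i]]
        # Find the end of the run of positions that share this target.
        j = i
        while j < len(order) and indices[order[j]] == target:
            j += 1
        run = order[i:j]
        if accumulate:
            # Sum every duplicate into the destination, in sorted order.
            for pos in run:
                out[target] += values[pos]
        else:
            # Assumption for this sketch: the last duplicate wins.
            out[target] = values[run[-1]]
        i = j
    return out


# Index 2 appears twice; the result is the same on every run.
overwritten = index_put_deterministic([0, 0, 0, 0], [2, 0, 2], [10, 20, 30])
accumulated = index_put_deterministic([0, 0, 0, 0], [2, 0, 2], [10, 20, 30],
                                      accumulate=True)
```

The two branches share all of the sorting and run-detection logic and differ only in how a run of duplicates is reduced.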
Test Plan:
buck test mode/opt //caffe2/test:torch -- test_index_put_non_accumulate_deterministic
✓ Pass: caffe2/test:torch - test_index_put_non_accumulate_deterministic_cpu (test_torch.TestTorchDeviceTypeCPU) (5.120)
Summary
Pass: 1
Skip: 1
↻ caffe2/test:torch - test_index_put_non_accumulate_deterministic_meta (test_torch.TestTorchDeviceTypeMETA)
ListingSuccess: 1
buck test mode/opt //caffe2/test:torch_cuda -- test_index_put_non_accumulate_deterministic
✓ ListingSuccess: caffe2/test:torch_cuda - main (6.397)
✓ Pass: caffe2/test:torch_cuda - test_index_put_non_accumulate_deterministic_cuda (test_torch.TestTorchDeviceTypeCUDA) (26.030)
✓ Pass: caffe2/test:torch_cuda - main (26.030)
Summary
Pass: 2
ListingSuccess: 1
Reviewed By: ngimel
Differential Revision: D28290699
fbshipit-source-id: df8bbe7af2e72017566161b05b85737fda4ceb3f